METHODS OF DIAGNOSIS OF ANGIOGENESIS, COMPOSITIONS
AND METHODS OF SCREENING FOR ANGIOGENESIS
MODULATORS



CROSS-REFERENCES TO RELATED APPLICATIONS 
This application claims priority to USSN 09/784,356, filed February 14, 2001;
USSN 09/791,390, filed February 22, 2001; USSN 60/285,475, filed April 19, 2001; USSN
60/310,025, filed August 3, 2001; and USSN 60/334,244, filed November 29, 2001, each of
which is herein incorporated by reference in its entirety.

FIELD OF THE INVENTION 
The invention relates to the identification of nucleic acid and protein
expression profiles and nucleic acids, products, and antibodies thereto that are involved in
angiogenesis; and to the use of such expression profiles and compositions in diagnosis and
therapy of angiogenesis. The invention further relates to methods for identifying and using
agents and/or targets that modulate angiogenesis.

BACKGROUND OF THE INVENTION 
Both vasculogenesis, the development of an interactive vascular system
comprising arteries and veins, and angiogenesis, the generation of new blood vessels, play a
role in embryonic development. In contrast, angiogenesis is limited in a normal adult to the
placenta, ovary, endometrium and sites of wound healing. However, angiogenesis, or its
absence, plays an important role in the maintenance of a variety of pathological states. Some
of these states are characterized by neovascularization, e.g., cancer, diabetic retinopathy,
glaucoma, and age-related macular degeneration. Others, e.g., stroke, infertility, heart
disease, ulcers, and scleroderma, are diseases of angiogenic insufficiency.

Angiogenesis has a number of stages (see, e.g., Folkman, J. Natl Cancer Inst
82:4-6, 1990; Firestein, J Clin Invest 103:3-4, 1999; Koch, Arthritis Rheum 41:951-62, 1998;
Carter, Oncologist 5(Suppl 1):51-4, 2000; Browder et al., Cancer Res. 60:1878-86, 2000; and
Zhu and Witte, Invest New Drugs 17:195-212, 1999). The early stages of angiogenesis
include endothelial cell protease production, migration of cells, and proliferation. The early
stages also appear to require some growth factors, with VEGF, TGF-α, angiostatin, and
selected chemokines all putatively playing a role. Later stages of angiogenesis include
population of the vessels with mural cells (pericytes or smooth muscle cells), basement
membrane production, and the induction of vessel bed specializations. The final stages of
vessel formation include what is known as "remodeling", wherein a forming vasculature
becomes a stable, mature vessel bed. Thus, the process is highly dynamic, often requiring
coordinated spatial and temporal waves of gene expression.

Conversely, the complex process may be subject to disruption by interfering
with one or more critical steps. Thus, the lack of understanding of the dynamics of
angiogenesis prevents therapeutic intervention in serious diseases such as those indicated. It
is an object of the invention to provide methods that can be used to screen compounds for the
ability to modulate angiogenesis. Additionally, it is an object to provide molecular targets for
therapeutic intervention in disease states which either have an undesirable excess or a deficit
in angiogenesis. The present invention provides solutions to both.

SUMMARY OF THE INVENTION 
The present invention provides compositions and methods for detecting or
modulating angiogenesis-associated sequences.

In one aspect, the invention provides a method of detecting an
angiogenesis-associated transcript in a cell in a patient, the method comprising contacting a
biological sample from the patient with a polynucleotide that selectively hybridizes to a
sequence at least 80% identical to a sequence as shown in Tables 1-8. In one embodiment, the
biological sample is a tissue sample. In another embodiment, the biological sample comprises
isolated nucleic acids, which are often mRNA.

In another embodiment, the method further comprises the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.
Often, the polynucleotide comprises a sequence as shown in Tables 1-8. The polynucleotide
can be labeled, for example, with a fluorescent label and can be immobilized on a solid
surface.

In other embodiments the patient is undergoing a therapeutic regimen to treat a
disease associated with angiogenesis or the patient is suspected of having an
angiogenesis-associated disorder.

In another aspect, the invention comprises an isolated nucleic acid molecule
consisting of a polynucleotide sequence as shown in Tables 1-8. The nucleic acid molecule
can be labeled, for example, with a fluorescent label.

In other aspects, the invention provides an expression vector comprising an
isolated nucleic acid molecule consisting of a polynucleotide sequence as shown in Tables
1-8 or a host cell comprising the expression vector.

In another embodiment, the isolated nucleic acid molecule encodes a 
polypeptide having an amino acid sequence as shown in Table 8. 

In another aspect, the invention provides an isolated polypeptide which is
encoded by a nucleic acid molecule having a polynucleotide sequence as shown in Tables 1-8.
In one embodiment, the isolated polypeptide has an amino acid sequence as shown in Table 

In another embodiment, the invention provides an antibody that specifically
binds a polypeptide that has an amino acid sequence as shown in Table 8 or which is encoded
by a nucleotide sequence of Tables 1-8. The antibody can be conjugated or fused to an
effector component such as a fluorescent label, a toxin, or a radioisotope. In some
embodiments, the antibody is an antibody fragment or a humanized antibody.

In another aspect, the invention provides a method of detecting a cell
undergoing angiogenesis in a biological sample from a patient, the method comprising
contacting the biological sample with an antibody that specifically binds to a polypeptide that
has an amino acid sequence as shown in Table 8 or which is encoded by a nucleotide
sequence of Tables 1-8. In some embodiments, the antibody is further conjugated or fused to
an effector component, for example, a fluorescent label.

In another embodiment, the invention provides a method of detecting
antibodies specific to angiogenesis in a patient, the method comprising contacting a
biological sample from the patient with a polypeptide which is encoded by a nucleotide
sequence of Tables 1-8.

The invention also provides a method of identifying a compound that
modulates the activity of an angiogenesis-associated polypeptide, the method comprising the
steps of: (i) contacting the compound with a polypeptide that comprises at least 80% identity
to an amino acid sequence as shown in Table 8 or which is encoded by a nucleotide sequence
of Tables 1-8; and (ii) detecting an increase or a decrease in the activity of the polypeptide.
In one embodiment, the polypeptide has an amino acid sequence as shown in Table 8 or is a
polypeptide encoded by a nucleotide sequence of Tables 1-8. In another embodiment, the
polypeptide is expressed in a cell.

The invention also provides a method of identifying a compound that
modulates angiogenesis, the method comprising the steps of: (i) contacting the compound with a
cell undergoing angiogenesis; and (ii) detecting an increase or a decrease in the expression of
a polypeptide sequence as shown in Table 8 or a polypeptide which is encoded by a
nucleotide sequence of Tables 1-8. In one embodiment, the detecting step comprises
hybridizing a nucleic acid sample from the cell with a polynucleotide that selectively
hybridizes to a sequence at least 80% identical to a sequence as shown in Tables 1-8. In
another embodiment, the method further comprises detecting an increase or decrease in the
expression of a second sequence as shown in Table 8 or a polypeptide which is encoded by a
nucleotide sequence of Tables 1-8.

In another embodiment, the invention provides a method of inhibiting
angiogenesis in a cell that expresses a polypeptide at least 80% identical to a sequence as
shown in Table 8 or which is 80% identical to a polypeptide encoded by a nucleotide
sequence of Tables 1-8, the method comprising the step of contacting the cell with a
therapeutically effective amount of an inhibitor of the polypeptide. In one embodiment, the
polypeptide has an amino acid sequence shown in Table 8 or is a polypeptide which is
encoded by a nucleotide sequence of Tables 1-8. In another embodiment, the inhibitor is an
antibody.

In other embodiments, the invention provides a method of activating
angiogenesis in a cell that expresses a polypeptide at least 80% identical to a sequence as
shown in Table 8 or at least 80% identical to a polypeptide which is encoded by a nucleotide
sequence of Tables 1-8, the method comprising the step of contacting the cell with a
therapeutically effective amount of an activator of the polypeptide. In one embodiment, the
polypeptide has an amino acid sequence shown in Table 8 or is a polypeptide which is
encoded by a nucleotide sequence of Tables 1-8.

Other aspects of the invention will become apparent to the skilled artisan by 
the following description of the invention. 



Tables 1-8 provide nucleotide sequence of genes that exhibit changes in
expression levels as a function of time in tissue undergoing angiogenesis compared to tissue
that is not.

DESCRIPTION OF THE SPECIFIC EMBODIMENTS
In accordance with the objects outlined above, the present invention provides
novel methods for diagnosis and treatment of disorders associated with angiogenesis
(sometimes referred to herein as angiogenesis disorders or AD), as well as methods for
screening for compositions which modulate angiogenesis. By "disorder associated with
angiogenesis" or "disease associated with angiogenesis" herein is meant a disease state which
is marked by either an excess or a deficit of blood vessel development. Angiogenesis
disorders associated with increased angiogenesis include, but are not limited to, cancer and
proliferative diabetic retinopathy. Pathological states for which it may be desirable to
increase angiogenesis include stroke, heart disease, infertility, ulcers, wound healing,
ischemia, and scleroderma. Solid tumors typically require angiogenesis to support or sustain
growth, e.g., breast, colon, lung, brain, bladder, and prostate tumors. Other AD include, e.g.,
arthritis, inflammatory bowel disease, diabetic retinopathy, macular degeneration,
atherosclerosis, and psoriasis. Also provided are methods for treating AD.

Definitions

The term "angiogenesis protein" or "angiogenesis polynucleotide" refers to
nucleic acid and polypeptide polymorphic variants, alleles, mutants, and interspecies
homologs that: (1) have an amino acid sequence that has greater than about 60% amino acid
sequence identity, 65%, 70%, 75%, 80%, 85%, 90%, preferably 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98% or 99% or greater amino acid sequence identity, preferably over a region of
at least about 25, 50, 100, 200, 500, 1000, or more amino acids, to an
angiogenesis protein sequence of Table 8; (2) bind to antibodies, e.g., polyclonal antibodies,
raised against an immunogen comprising an amino acid sequence of Table 8, and
conservatively modified variants thereof; (3) specifically hybridize under stringent
hybridization conditions to an anti-sense strand corresponding to a nucleic acid sequence of
Tables 1-8 and conservatively modified variants thereof; (4) have a nucleic acid sequence
that has greater than about 95%, preferably greater than about 96%, 97%, 98%, 99%, or
higher nucleotide sequence identity, preferably over a region of at least about 25, 50, 100,
200, 500, 1000, or more nucleotides, to a sense sequence corresponding to one set out in
Tables 1-8. A polynucleotide or polypeptide sequence is typically from a mammal
including, but not limited to, primate, e.g., human; rodent, e.g., rat, mouse, hamster; cow, pig, 
horse, sheep, or any mammal. An "angiogenesis polypeptide" and an "angiogenesis
polynucleotide," include both naturally occurring or recombinant

A "full length" angiogenesis protein or nucleic acid refers to an angiogenesis
polypeptide or polynucleotide sequence, or a variant thereof, that contains all of the elements
normally contained in one or more naturally occurring, wild type angiogenesis polynucleotide
or polypeptide sequences. The "full length" may be prior to, or after, various stages of
post-translation processing.

"Biological sample" as used herein is a sample of biological tissue or fluid that
contains nucleic acids or polypeptides, e.g., of an angiogenic protein. Such samples include,
but are not limited to, tissue isolated from primates, e.g., humans, or rodents, e.g., mice, and
rats. Biological samples may also include sections of tissues such as biopsy and autopsy
samples, and frozen sections taken for histologic purposes. A biological sample is typically
obtained from a eukaryotic organism, most preferably a mammal such as a primate, e.g.,
chimpanzee or human; cow; dog; cat; a rodent, e.g., guinea pig, rat, mouse; rabbit; or a bird;
reptile; or fish.

"Providing a biological sample" means to obtain a biological sample for use in
methods described in this invention. Most often, this will be done by removing a sample of
cells from an animal, but can also be accomplished by using previously isolated cells (e.g.,
isolated by another person, at another time, and/or for another purpose), or by performing the
methods of the invention in vivo. Archival tissues, having treatment or outcome history, will
be particularly useful.

The terms "identical" or percent "identity," in the context of two or more
nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that
are the same or have a specified percentage of amino acid residues or nucleotides that are the
same (i.e., about 70% identity, preferably 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99%, or higher identity over a specified region (e.g., SEQ ID NOS: 1-229),
when compared and aligned for maximum correspondence over a comparison window or
designated region) as measured using a BLAST or BLAST 2.0 sequence comparison
algorithm with default parameters described below, or by manual alignment and visual
inspection (see, e.g., NCBI web site http://www.ncbi.nlm.nih.gov/BLAST/ or the like). Such
sequences are then said to be "substantially identical." This definition also refers to, or may
be applied to, the complement of a test sequence. The definition also includes sequences that
have deletions and/or additions, as well as those that have substitutions. As described below,
the preferred algorithms can account for gaps and the like. Preferably, identity exists over a
region that is at least about 25 amino acids or nucleotides in length, or more preferably over a
region that is 50-100 amino acids or nucleotides in length.

For sequence comparison, typically one sequence acts as a reference sequence,
to which test sequences are compared. When using a sequence comparison algorithm, test
and reference sequences are entered into a computer, subsequence coordinates are designated,
if necessary, and sequence algorithm program parameters are designated. Preferably, default
program parameters can be used, or alternative parameters can be designated. The sequence
comparison algorithm then calculates the percent sequence identities for the test sequences
relative to the reference sequence, based on the program parameters.
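
By way of illustration only, the calculation of percent identity for several test sequences relative to one reference sequence may be sketched as follows. This is a minimal sketch, assuming the sequences have already been aligned to equal length with "-" as the gap character and that gap columns count as mismatches; other conventions for the denominator exist. The function name percent_identity and the example sequences are hypothetical and are not taken from Tables 1-8.

```python
def percent_identity(reference: str, test: str) -> float:
    """Percent identity between two pre-aligned, equal-length sequences.

    Assumption: '-' marks a gap. A gap opposite a residue counts as a
    mismatch; columns where both sequences have a gap are ignored.
    """
    if len(reference) != len(test):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (r, t)
        for r, t in zip(reference.upper(), test.upper())
        if not (r == "-" and t == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for r, t in columns if r == t and r != "-")
    return 100.0 * matches / len(columns)


# Hypothetical reference and test sequences, already aligned.
reference = "ACGTACGTAC"
tests = {
    "test_1": "ACGTACGTAC",  # identical to the reference
    "test_2": "ACGAACGTTC",  # two substitutions
    "test_3": "ACG-ACGTAC",  # one deletion relative to the reference
}
identities = {name: percent_identity(reference, seq) for name, seq in tests.items()}
```

A test sequence could then be compared against a chosen threshold, e.g., the 80% identity recited in the Summary.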

A "comparison window", as used herein, includes reference to a segment of
any one of the number of contiguous positions selected from the group consisting of from 20
to 600, usually about 50 to about 200, more usually about 100 to about 150 in which a
sequence may be compared to a reference sequence of the same number of contiguous
positions after the two sequences are optimally aligned. Methods of alignment of sequences
for comparison are well-known in the art. Optimal alignment of sequences for comparison
can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl.
Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol.
Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Natl.
Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms
(GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package,
Genetics Computer Group, 575 Science Dr., Madison, WI), or by manual alignment and
visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds.
1995 supplement)).
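
For illustration, the local homology approach of Smith & Waterman cited above may be sketched as a dynamic-programming recurrence that returns the best local alignment score. This is a minimal sketch under assumed scoring values (match, mismatch, and a linear gap penalty chosen here only for the example); it is not the GAP, BESTFIT, or FASTA implementation, and the function name is hypothetical.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Scoring values are illustrative assumptions, not program defaults.
    Only two rows of the score matrix are kept in memory.
    """
    cols = len(b) + 1
    prev = [0] * cols
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * cols
        for j in range(1, cols):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never go below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Hypothetical sequences sharing the local segment "ACG".
score = smith_waterman_score("TTACGTTT", "GGACGGG")
```

Because every cell is floored at zero, unrelated flanking sequence does not reduce the score of the shared segment, which is what distinguishes this local method from the global alignment of Needleman & Wunsch.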

Preferred examples of algorithms that are suitable for determining percent
sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which
are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and Altschul et al., Nuc.
Acids Res. 25:3389-3402 (1997), respectively. BLAST and BLAST 2.0 are used, with the
parameters described herein, to determine percent sequence identity for the nucleic acids and
proteins of the invention. Software for performing BLAST analyses is publicly available
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/).
This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying
short words of length W in the query sequence, which either match or satisfy some
positive-valued threshold score T when aligned with a word of the same length in a database
sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra).
These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs
containing them. The word hits are extended in both directions along each sequence for as
far as the cumulative alignment score can be increased. Cumulative scores are calculated
using, for nucleotide sequences, the parameters M (reward score for a pair of matching
residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino
acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T, and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a
comparison of both strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix
(see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of
50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
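
The extension of a word hit described above may be illustrated, for nucleotide sequences, by the following sketch. It is not the BLAST program itself: it assumes an exact (not neighborhood) word match, performs ungapped extension only, and uses a word length of 4 and an X value of 20 purely for the example, with M=5 and N=-4 as stated above. The function names and the sequences are hypothetical.

```python
def _extend(query, subject, qi, si, step, base, reward, penalty, x_drop):
    """Extend in one direction; return (best score gain, length at best gain)."""
    gain = best_gain = 0
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= si < len(subject):
        gain += reward if query[qi] == subject[si] else penalty
        length += 1
        if gain > best_gain:
            best_gain, best_len = gain, length
        elif best_gain - gain >= x_drop or base + gain <= 0:
            # Halt: score fell off by X from its maximum, or reached zero or below.
            break
        qi += step
        si += step
    return best_gain, best_len


def extend_word_hit(query, subject, q_start, s_start, word_len,
                    reward=5, penalty=-4, x_drop=20):
    """Ungapped two-way extension of an exact word hit.

    Returns (score, q_from, q_to), where query[q_from:q_to] is the extended
    segment. x_drop and word_len are illustrative assumptions.
    """
    if query[q_start:q_start + word_len] != subject[s_start:s_start + word_len]:
        raise ValueError("seed must be an exact word match")
    base = word_len * reward
    right_gain, right_len = _extend(query, subject, q_start + word_len,
                                    s_start + word_len, 1, base,
                                    reward, penalty, x_drop)
    left_gain, left_len = _extend(query, subject, q_start - 1, s_start - 1,
                                  -1, base, reward, penalty, x_drop)
    score = base + left_gain + right_gain
    return score, q_start - left_len, q_start + word_len + right_len


# Hypothetical sequences sharing "ACGTACGTACGT"; the seed is "ACGT" at position 6.
query = "TTACGTACGTACGTAA"
subject = "GGACGTACGTACGTCC"
hsp = extend_word_hit(query, subject, q_start=6, s_start=6, word_len=4)
```

In this example the extension stops at the ends of the sequences after the flanking mismatches fail to raise the score, and the reported segment is trimmed back to the position of the maximum cumulative score.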

The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA
90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest
sum probability (P(N)), which provides an indication of the probability by which a match
between two nucleotide or amino acid sequences would occur by chance. For example, a
nucleic acid is considered similar to a reference sequence if the smallest sum probability in a
comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more
preferably less than about 0.01, and most preferably less than about 0.001.

An indication that two nucleic acid sequences or polypeptides are substantially
identical is that the polypeptide encoded by the first nucleic acid is immunologically cross
reactive with the antibodies raised against the polypeptide encoded by the second nucleic
acid, as described below. Thus, a polypeptide is typically substantially identical to a second
polypeptide, for example, where the two peptides differ only by conservative substitutions.
Another indication that two nucleic acid sequences are substantially identical is that the two
molecules or their complements hybridize to each other under stringent conditions, as
described below. Yet another indication that two nucleic acid sequences are substantially
identical is that the same primers can be used to amplify the sequences.

A "host cell" is a naturally occurring cell or a transformed cell that contains an
expression vector and supports the replication or expression of the expression vector. Host
cells may be cultured cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells such as E. coli, or eukaryotic cells such as yeast, insect, amphibian, or
mammalian cells such as CHO, HeLa, and the like (see, e.g., the American Type Culture
Collection catalog or web site, www.atcc.org).

The terms "polypeptide," "peptide" and "protein" are used interchangeably
herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers
in which one or more amino acid residue is an artificial chemical mimetic of a corresponding
naturally occurring amino acid, as well as to naturally occurring amino acid polymers and
non-naturally occurring amino acid polymers.

The term "amino acid" refers to naturally occurring and synthetic amino acids,
as well as amino acid analogs and amino acid mimetics that function in a manner similar to
the naturally occurring amino acids. Naturally occurring amino acids are those encoded by
the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline,
γ-carboxyglutamate, and O-phosphoserine. Amino acid analogs refers to compounds that have
the same basic chemical structure as a naturally occurring amino acid, i.e., an α carbon that is
bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine,
norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified
R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical
structure as a naturally occurring amino acid. Amino acid mimetics refers to chemical
compounds that have a structure that is different from the general chemical structure of an
amino acid, but that function in a manner similar to a naturally occurring amino acid.

Amino acids may be referred to herein by either their commonly known three 
letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical 
Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly 
accepted single-letter codes. 

"Conservatively modified variants" applies to both amino acid and nucleic
acid sequences. With respect to particular nucleic acid sequences, conservatively modified
variants refers to those nucleic acids which encode identical or essentially identical amino
acid sequences, or where the nucleic acid does not encode an amino acid sequence, to
essentially identical sequences. Because of the degeneracy of the genetic code, a large
number of functionally identical nucleic acids encode any given protein. For instance, the
codons GCA, GCC, GCG and GCU all encode the amino acid alanine. Thus, at every
position where an alanine is specified by a codon, the codon can be altered to any of the
corresponding codons described without altering the encoded polypeptide. Such nucleic acid
variations are "silent variations," which are one species of conservatively modified
variations. Every nucleic acid sequence herein which encodes a polypeptide also describes
every possible silent variation of the nucleic acid. One of skill will recognize that each codon
in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG,
which is ordinarily the only codon for tryptophan) can be modified to yield a functionally
identical molecule. Accordingly, each silent variation of a nucleic acid which encodes a
polypeptide is implicit in each described sequence with respect to the expression product, but
not with respect to actual probe sequences.
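
The degeneracy described above may be illustrated by the following sketch, which enumerates the synonymous codons for a given codon and counts how many nucleic acid sequences encode the same polypeptide. It assumes the standard genetic code, written here in DNA letters (U is converted to T); the function names and the example coding sequence are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list[str]:
    """All codons (including the input) encoding the same amino acid."""
    codon = codon.upper().replace("U", "T")
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(coding_sequence: str) -> int:
    """Number of distinct sequences encoding the same polypeptide.

    The count includes the input sequence itself, so the number of silent
    variations is this value minus one.
    """
    seq = coding_sequence.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return prod(len(synonymous_codons(seq[i:i + 3])) for i in range(0, len(seq), 3))


alanine_codons = synonymous_codons("GCU")   # the four alanine codons
n_encodings = count_encodings("AUGGCAUGG")  # Met-Ala-Trp
```

For the Met-Ala-Trp example only the alanine codon can vary, consistent with the statement that the methionine and tryptophan codons are ordinarily unique.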

As to amino acid sequences, one of skill will recognize that individual
substitutions, deletions or additions to a nucleic acid, peptide, polypeptide, or protein
sequence which alters, adds or deletes a single amino acid or a small percentage of amino
acids in the encoded sequence is a "conservatively modified variant" where the alteration
results in the substitution of an amino acid with a chemically similar amino acid.
Conservative substitution tables providing functionally similar amino acids are well known in
the art. Such conservatively modified variants are in addition to and do not exclude
polymorphic variants, interspecies homologs, and alleles of the invention.

The following eight groups each contain amino acids that are conservative
substitutions for one another: 1) Alanine (A), Glycine (G); 2) Aspartic acid (D), Glutamic acid
(E); 3) Asparagine (N), Glutamine (Q); 4) Arginine (R), Lysine (K); 5) Isoleucine (I),
Leucine (L), Methionine (M), Valine (V); 6) Phenylalanine (F), Tyrosine (Y), Tryptophan
(W); 7) Serine (S), Threonine (T); and 8) Cysteine (C), Methionine (M) (see, Creighton,
Proteins (1984)).
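
As an illustration, the eight groups listed above can be encoded directly and used to test whether a single amino acid replacement is a conservative substitution. This is a minimal sketch; the function name is hypothetical, one-letter codes are assumed, and an unchanged residue is treated here as not being a substitution at all.

```python
# The eight conservative substitution groups listed above (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if the two different residues share at least one group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # assumption: identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


examples = {
    ("D", "E"): is_conservative_substitution("D", "E"),
    ("M", "C"): is_conservative_substitution("M", "C"),
    ("C", "L"): is_conservative_substitution("C", "L"),
}
```

Methionine appears in both group 5 and group 8, so it is conservative with respect to leucine and to cysteine, although cysteine and leucine share no group with each other.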

Macromolecular structures such as polypeptide structures can be described in
terms of various levels of organization. For a general discussion of this organization, see,
e.g., Alberts et al., Molecular Biology of the Cell (3rd ed., 1994) and Cantor and Schimmel,
Biophysical Chemistry Part I: The Conformation of Biological Macromolecules (1980).
"Primary structure" refers to the amino acid sequence of a particular peptide. "Secondary
structure" refers to locally ordered, three dimensional structures within a polypeptide. These
structures are commonly known as domains. Domains are portions of a polypeptide that
form a compact unit of the polypeptide and are typically 25 to approximately 500 amino
acids long. Typical domains are made up of sections of lesser organization such as stretches
of β-sheet and α-helices. "Tertiary structure" refers to the complete three dimensional
structure of a polypeptide monomer. "Quaternary structure" refers to the three dimensional
structure formed, usually by the noncovalent association of independent tertiary units.
Anisotropic terms are also known as energy terms.

A "label" or a "detectable moiety" is a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical
means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins
which can be made detectable, e.g., by incorporating a radiolabel into the peptide or used to
detect antibodies specifically reactive with the peptide.

An "effector" or "effector moiety" or "effector component" is a molecule that
is bound (or linked, or conjugated), either covalently, through a linker or a chemical bond, or
noncovalently, through ionic, van der Waals, electrostatic, or hydrogen bonds, to an antibody.
The "effector" can be a variety of molecules including, for example, detection moieties
including radioactive compounds, fluorescent compounds, an enzyme or substrate, tags such
as epitope tags, a toxin; a chemotherapeutic agent; a lipase; an antibiotic; or a radioisotope
emitting "hard" e.g., beta radiation.

A "labeled nucleic acid probe or oligonucleotide" is one that is bound, either
covalently, through a linker or a chemical bond, or noncovalently, through ionic, van der
Waals, electrostatic, or hydrogen bonds to a label such that the presence of the probe may be
detected by detecting the presence of the label bound to the probe. Alternatively, methods
using high affinity interactions may achieve the same results where one of a pair of binding
partners binds to the other, e.g., biotin, streptavidin.

As used herein a "nucleic acid probe or oligonucleotide" is defined as a
nucleic acid capable of binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary base pairing, usually
through hydrogen bond formation. As used herein, a probe may include natural (i.e., A, G, C,
or T) or modified bases (7-deazaguanosine, inosine, etc.). In addition, the bases in a probe
may be joined by a linkage other than a phosphodiester bond, so long as it does not interfere
with hybridization. Thus, for example, probes may be peptide nucleic acids in which the
constituent bases are joined by peptide bonds rather than phosphodiester linkages. It will be
understood by one of skill in the art that probes may bind target sequences lacking complete
complementarity with the probe sequence depending upon the stringency of the hybridization
conditions. The probes are preferably directly labeled as with isotopes, chromophores,
lumiphores, chromogens, or indirectly labeled such as with biotin to which a streptavidin
complex may later bind. By assaying for the presence or absence of the probe, one can detect
the presence or absence of the select sequence or subsequence.

The term "recombinant" when used with reference, e.g., to a cell, or nucleic
acid, protein, or vector, indicates that the cell, nucleic acid, protein or vector, has been
modified by the introduction of a heterologous nucleic acid or protein or the alteration of a
native nucleic acid or protein, or that the cell is derived from a cell so modified. Thus, for
example, recombinant cells express genes that are not found within the native
(non-recombinant) form of the cell or express native genes that are otherwise abnormally
expressed, under expressed or not expressed at all.

The term "heterologous" when used with reference to portions of a nucleic
acid indicates that the nucleic acid comprises two or more subsequences that are not found in
the same relationship to each other in nature. For instance, the nucleic acid is typically
recombinantly produced, having two or more sequences from unrelated genes arranged to
make a new functional nucleic acid, e.g., a promoter from one source and a coding region
from another source. Similarly, a heterologous protein indicates that the protein comprises
two or more subsequences that are not found in the same relationship to each other in nature
(e.g., a fusion protein).

A "promoter" is defined as an array of nucleic acid control sequences that
direct transcription of a nucleic acid. As used herein, a promoter includes necessary nucleic
acid sequences near the start site of transcription, such as, in the case of a polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer or repressor
elements, which can be located as much as several thousand base pairs from the start site of
transcription. A "constitutive" promoter is a promoter that is active under most
environmental and developmental conditions. An "inducible" promoter is a promoter that is
active under environmental or developmental regulation. The term "operably linked" refers
to a functional linkage between a nucleic acid expression control sequence (such as a
promoter, or array of transcription factor binding sites) and a second nucleic acid sequence,
wherein the expression control sequence directs transcription of the nucleic acid
corresponding to the second sequence.

An "expression vector" is a nucleic acid construct, generated recombinantly or
synthetically, with a series of specified nucleic acid elements that permit transcription of a
particular nucleic acid in a host cell. The expression vector can be part of a plasmid, virus, or
nucleic acid fragment. Typically, the expression vector includes a nucleic acid to be
transcribed operably linked to a promoter.

The phrase "selectively (or specifically) hybridizes to" refers to the binding,
duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under 
stringent hybridization conditions when that sequence is present in a complex mixture (e.g., 
total cellular or library DNA or RNA).

The phrase "stringent hybridization conditions" refers to conditions under
which a probe will hybridize to its target subsequence, typically in a complex mixture of
nucleic acids, but to no other sequences. Stringent conditions are sequence-dependent and
will be different in different circumstances. Longer sequences hybridize specifically at
higher temperatures. An extensive guide to the hybridization of nucleic acids is found in
Tijssen, Techniques in Biochemistry and Molecular Biology-Hybridization with Nucleic
Probes, "Overview of principles of hybridization and the strategy of nucleic acid assays"
(1993). Generally, stringent conditions are selected to be about 5-10°C lower than the
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is
the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50%
of the probes complementary to the target hybridize to the target sequence at equilibrium (as
the target sequences are present in excess, at Tm, 50% of the probes are occupied at
equilibrium). Stringent conditions will be those in which the salt concentration is less than
about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion concentration (or other
salts) at pH 7.0 to 8.3 and the temperature is at least about 30°C for short probes (e.g., 10 to
50 nucleotides) and at least about 60°C for long probes (e.g., greater than 50 nucleotides).
Stringent conditions may also be achieved with the addition of destabilizing agents such as
formamide. For selective or specific hybridization, a positive signal is at least two times
background, preferably 10 times background hybridization. Exemplary stringent
hybridization conditions can be as follows: 50% formamide, 5x SSC, and 1% SDS,
incubating at 42°C, or, 5x SSC, 1% SDS, incubating at 65°C, with wash in 0.2x SSC, and
0.1% SDS at 65°C. For PCR, a temperature of about 36°C is typical for low stringency
amplification, although annealing temperatures may vary between about 32°C and 48°C
depending on primer length. For high stringency PCR amplification, a temperature of about
62°C is typical, although high stringency annealing temperatures can range from about 50°C
to about 65°C, depending on the primer length and specificity. Typical cycle conditions for
both high and low stringency amplifications include a denaturation phase of 90°C - 95°C for
30 sec. - 2 min., an annealing phase lasting 30 sec. - 2 min., and an extension phase of about
72°C for 1 - 2 min. Protocols and guidelines for low and high stringency amplification
reactions are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., N.Y.
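
The numeric rules in the preceding paragraph can be illustrated with a short Python sketch. It computes a hybridization temperature window about 5-10°C below the Tm and checks whether a signal counts as positive (at least two times background, preferably 10 times). The function names and the example Tm of 70°C are assumptions made for illustration only and are not part of the disclosed methods.

```python
def stringent_temperature_window(tm_celsius):
    """Return (low, high) temperatures about 10 and 5 degrees C below the Tm."""
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def is_positive_signal(signal, background, fold=2.0):
    """Return True when the signal is at least `fold` times background.

    fold=2.0 reflects the minimum stated in the text; fold=10.0 reflects
    the preferred level.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    return signal >= fold * background


if __name__ == "__main__":
    # Hypothetical Tm of 70 C, chosen only for illustration.
    low, high = stringent_temperature_window(70.0)
    print("hybridize between", low, "and", high, "degrees C")
    print("positive at 2x:", is_positive_signal(250.0, 100.0))
    print("positive at 10x:", is_positive_signal(250.0, 100.0, fold=10.0))
```

The sketch covers only the Tm-relative temperature rule and the signal-to-background rule; salt concentration, pH and formamide content would have to be handled separately.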

Nucleic acids that do not hybridize to each other under stringent conditions are
still substantially identical if the polypeptides which they encode are substantially identical.
This occurs, for example, when a copy of a nucleic acid is created using the maximum codon
degeneracy permitted by the genetic code. In such cases, the nucleic acids typically hybridize
under moderately stringent hybridization conditions. Exemplary "moderately stringent
hybridization conditions" include a hybridization in a buffer of 40% formamide, 1 M NaCl,
1% SDS at 37°C, and a wash in 1X SSC at 45°C. A positive hybridization is at least twice
background. Those of ordinary skill will readily recognize that alternative hybridization and
wash conditions can be utilized to provide conditions of similar stringency. Additional
guidelines for determining hybridization parameters are provided in numerous references, e.g.,
Current Protocols in Molecular Biology, ed. Ausubel, et al.

The phrase "functional effects" in the context of assays for testing compounds
that modulate activity of an angiogenesis protein includes the determination of a parameter
that is indirectly or directly under the influence of the angiogenesis protein, e.g., a functional,
physical, or chemical effect, such as the ability to increase or decrease angiogenesis. It
includes binding activity, the ability of cells to proliferate, expression in cells undergoing
angiogenesis, and other characteristics of angiogenic cells. "Functional effects" include in
vitro, in vivo, and ex vivo activities.

By "determining the functional effect" is meant assaying for a compound that 
increases or decreases a parameter that is indirectly or directly under the influence of an
angiogenesis protein sequence, e.g., functional, physical and chemical effects. Such
functional effects can be measured by any means known to those skilled in the art, e.g.,
changes in spectroscopic characteristics (e.g., fluorescence, absorbance, refractive index),
hydrodynamic (e.g., shape), chromatographic, or solubility properties for the protein,
measuring inducible markers or transcriptional activation of the angiogenesis protein;
measuring binding activity or binding assays, e.g., binding to antibodies, and measuring
cellular proliferation, particularly endothelial cell proliferation, cell viability, cell division
especially of endothelial cells, lumen formation and capillary or vessel growth or formation.
Determination of the functional effect of a compound on angiogenesis can also be performed
using angiogenesis assays known to those of skill in the art such as in vitro assays, e.g., in
vitro endothelial cell tube formation assays, and other assays such as the chick CAM assay,
the mouse corneal assay, and assays that assess vascularization of an implanted tumor. The
functional effects can be evaluated by many means known to those skilled in the art, e.g.,
microscopy for quantitative or qualitative measures of alterations in morphological features,
e.g., tube or blood vessel formation, measurement of changes in RNA or protein levels for
angiogenesis-associated sequences, measurement of RNA stability, identification of
downstream or reporter gene expression (CAT, luciferase, β-gal, GFP and the like), e.g., via
chemiluminescence, fluorescence, colorimetric reactions, antibody binding, inducible
markers, and ligand binding assays.

"Inhibitors", "activators", and "modulators" of angiogenic polynucleotide and 
polypeptide sequences are used to refer to activating, inhibitory, or modulating molecules
identified using in vitro and in vivo assays of angiogenic polynucleotide and polypeptide
sequences. Inhibitors are compounds that, e.g., bind to, partially or totally block activity,
decrease, prevent, delay activation, inactivate, desensitize, or down regulate the activity or
expression of angiogenesis proteins, e.g., antagonists. "Activators" are compounds that
increase, open, activate, facilitate, enhance activation, sensitize, agonize, or up regulate
angiogenesis protein activity. Inhibitors, activators, or modulators also include genetically
modified versions of angiogenesis proteins, e.g., versions with altered activity, as well as
naturally occurring and synthetic ligands, antagonists, agonists, antibodies, small chemical
molecules and the like. Such assays for inhibitors and activators include, e.g., expressing the
angiogenic protein in vitro, in cells, or cell membranes, applying putative modulator
compounds, and then determining the functional effects on activity, as described above.
Activators and inhibitors of angiogenesis can also be identified by incubating angiogenic
cells with the test compound and determining increases or decreases in the expression of 1 or
more angiogenesis proteins, e.g., 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 40, 50 or more angiogenesis
proteins, such as angiogenesis proteins comprising the sequences set out in Table 8.

Samples or assays comprising angiogenesis proteins that are treated with a
potential activator, inhibitor, or modulator are compared to control samples without the
inhibitor, activator, or modulator to examine the extent of inhibition. Control samples
(untreated with inhibitors) are assigned a relative protein activity value of 100%. Inhibition
of a polypeptide is achieved when the activity value relative to the control is about 80%,
preferably 50%, more preferably 25-0%. Activation of an angiogenesis polypeptide is
achieved when the activity value relative to the control (untreated with activators) is 110%,
more preferably 150%, more preferably 200-500% (i.e., two to five fold higher relative to the
control), more preferably 1000-3000% higher.

"Antibody" refers to a polypeptide comprising a framework region from an
immunoglobulin gene or fragments thereof that specifically binds and recognizes an antigen. 
The recognized immunoglobulin genes include the kappa, lambda, alpha, gamma, delta, 
epsilon, and mu constant region genes, as well as the myriad immunoglobulin variable region
genes. Light chains are classified as either kappa or lambda. Heavy chains are classified as
gamma, mu, alpha, delta, or epsilon, which in turn define the immunoglobulin classes, IgG, 
IgM, IgA, IgD and IgE, respectively. Typically, the antigen-binding region of an antibody 
will be most critical in specificity and affinity of binding. 

An exemplary immunoglobulin (antibody) structural unit comprises a
tetramer. Each tetramer is composed of two identical pairs of polypeptide chains, each pair
having one "light" (about 25 kD) and one "heavy" chain (about 50-70 kD). The N-terminus
of each chain defines a variable region of about 100 to 110 or more amino acids primarily
responsible for antigen recognition. The terms variable light chain (VL) and variable heavy
chain (VH) refer to these light and heavy chains respectively.

Antibodies exist, e.g., as intact immunoglobulins or as a number of well-characterized
fragments produced by digestion with various peptidases. Thus, for example,
pepsin digests an antibody below the disulfide linkages in the hinge region to produce F(ab)'2,
a dimer of Fab which itself is a light chain joined to VH-CH1 by a disulfide bond. The F(ab)'2
may be reduced under mild conditions to break the disulfide linkage in the hinge region,
thereby converting the F(ab)'2 dimer into an Fab' monomer. The Fab' monomer is
essentially Fab with part of the hinge region (see Fundamental Immunology (Paul ed., 3d ed.
1993)). While various antibody fragments are defined in terms of the digestion of an intact
antibody, one of skill will appreciate that such fragments may be synthesized de novo either
chemically or by using recombinant DNA methodology. Thus, the term antibody, as used
herein, also includes antibody fragments either produced by the modification of whole
antibodies, or those synthesized de novo using recombinant DNA methodologies (e.g., single
chain Fv) or those identified using phage display libraries (see, e.g., McCafferty et al., Nature
348:552-554 (1990)).

For preparation of antibodies, e.g., recombinant, monoclonal, or polyclonal
antibodies, many techniques known in the art can be used (see, e.g., Kohler & Milstein,
Nature 256:495-497 (1975); Kozbor et al., Immunology Today 4:72 (1983); Cole et al., pp.
77-96 in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985); Coligan,
Current Protocols in Immunology (1991); Harlow & Lane, Antibodies, A Laboratory Manual
(1988); and Goding, Monoclonal Antibodies: Principles and Practice (2d ed. 1986)).
Techniques for the production of single chain antibodies (U.S. Patent 4,946,778) can be
adapted to produce antibodies to polypeptides of this invention. Also, transgenic mice, or
other organisms such as other mammals, may be used to express humanized antibodies.
Alternatively, phage display technology can be used to identify antibodies and heteromeric
Fab fragments that specifically bind to selected antigens (see, e.g., McCafferty et al., Nature
348:552-554 (1990); Marks et al., Biotechnology 10:779-783 (1992)).

A "chimeric antibody" is an antibody molecule in which (a) the constant 
region, or a portion thereof, is altered, replaced or exchanged so that the antigen binding site 
(variable region) is linked to a constant region of a different or altered class, effector function 
and/or species, or an entirely different molecule which confers new properties to the chimeric 
antibody, e.g., an enzyme, toxin, hormone, growth factor, drug, etc.; or (b) the variable 
region, or a portion thereof is altered, replaced or exchanged with a variable region having a 
different or altered antigen specificity. 

The detailed description of the invention includes discussion of the following
aspects of the invention:

Expression of angiogenesis-associated sequences

Informatics

Angiogenesis-associated sequences

Detection of angiogenesis sequence for diagnostic and
therapeutic applications

Modulators of angiogenesis

Methods of identifying variant angiogenesis-associated
sequences

Administration of pharmaceutical and vaccine compositions

Kits for use in diagnostic and/or prognostic applications.

Expression of angiogenesis-associated sequences 

In one aspect, the expression levels of genes are determined in different
patient samples for which diagnosis information is desired, to provide expression profiles.
An expression profile of a particular sample is essentially a "fingerprint" of the state of the
sample; while two states may have any particular gene similarly expressed, the evaluation of
a number of genes simultaneously allows the generation of a gene expression profile that is
unique to the state of the cell. That is, normal tissue may be distinguished from AD tissue.
By comparing expression profiles of tissue in known different angiogenesis states,
information regarding which genes are important (including both up- and down-regulation of
genes) in each of these states is obtained. The identification of sequences that are
differentially expressed in angiogenic versus non-angiogenic tissue allows the use of this
information in a number of ways. For example, a particular treatment regime may be
evaluated: does a chemotherapeutic drug act to down-regulate angiogenesis, and thus tumor
growth or recurrence, in a particular patient? Similarly, diagnosis and treatment outcomes
may be done or confirmed by comparing patient samples with the known expression profiles.
Angiogenic tissue can also be analyzed to determine the stage of angiogenesis in the tissue.
Furthermore, these gene expression profiles (or individual genes) allow screening of drug
candidates with an eye to mimicking or altering a particular expression profile; for example,
screening can be done for drugs that suppress the angiogenic expression profile. This may be
done by making biochips comprising sets of the important angiogenesis genes, which can
then be used in these screens. These methods can also be done on the protein basis; that is,
protein expression levels of the angiogenic proteins can be evaluated for diagnostic purposes
or to screen candidate agents. In addition, the angiogenic nucleic acid sequences can be
administered for gene therapy purposes, including the administration of antisense nucleic
acids, or the angiogenic proteins (including antibodies and other modulators thereof)
administered as therapeutic drugs.
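
One way to picture the comparison of a patient sample with known expression profiles is the Python sketch below. It scores a sample against each reference profile using Pearson correlation over the same ordered set of genes and reports the best-matching state. The choice of Pearson correlation, the function names, and all expression values are assumptions made for illustration; the text does not prescribe a particular similarity measure.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def closest_state(sample, references):
    """Return the name of the reference profile best correlated with the sample."""
    return max(references, key=lambda name: pearson(sample, references[name]))


if __name__ == "__main__":
    # Hypothetical expression levels for four genes, in the same gene order.
    references = {
        "normal": [1.0, 1.0, 5.0, 5.0],
        "angiogenic": [5.0, 4.0, 1.0, 1.0],
    }
    sample = [4.5, 4.0, 1.2, 0.8]
    print(closest_state(sample, references))
```

In practice a profile would cover many more genes, and the reference profiles would come from tissue in known angiogenesis states.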

Thus the present invention provides nucleic acid and protein sequences that
are differentially expressed in angiogenesis, herein termed "angiogenesis sequences". As
outlined below, angiogenesis sequences include those that are up-regulated (i.e., expressed at
a higher level) in disorders associated with angiogenesis, as well as those that are
down-regulated (i.e., expressed at a lower level). In a preferred embodiment, the angiogenesis
sequences are from humans; however, as will be appreciated by those in the art, angiogenesis
sequences from other organisms may be useful in animal models of disease and drug
evaluation; thus, other angiogenesis sequences are provided, from vertebrates, including
mammals, including rodents (rats, mice, hamsters, guinea pigs, etc.), primates, and farm animals
(including sheep, goats, pigs, cows, horses, etc.). Angiogenesis sequences from other
organisms may be obtained using the techniques outlined below.

Angiogenesis sequences can include both nucleic acid and amino acid
sequences. In a preferred embodiment, the angiogenesis sequences are recombinant nucleic
acids. By the term "recombinant nucleic acid" herein is meant nucleic acid, originally formed
in vitro, in general, by the manipulation of nucleic acid, e.g., using polymerases and
endonucleases, in a form not normally found in nature. Thus an isolated nucleic acid, in a
linear form, or an expression vector formed in vitro by ligating DNA molecules that are not
normally joined, are both considered recombinant for the purposes of this invention. It is
understood that once a recombinant nucleic acid is made and reintroduced into a host cell or
organism, it will replicate non-recombinantly, i.e., using the in vivo cellular machinery of the
host cell rather than in vitro manipulations; however, such nucleic acids, once produced
recombinantly, although subsequently replicated non-recombinantly, are still considered
recombinant for the purposes of the invention.

Similarly, a "recombinant protein" is a protein made using recombinant
techniques, i.e., through the expression of a recombinant nucleic acid as depicted above. A
recombinant protein is distinguished from naturally occurring protein by at least one or more
characteristics. For example, the protein may be isolated or purified away from some or all
of the proteins and compounds with which it is normally associated in its wild type host, and
thus may be substantially pure. For example, an isolated protein is unaccompanied by at least
some of the material with which it is normally associated in its natural state, preferably
constituting at least about 0.5%, more preferably at least about 5% by weight of the total
protein in a given sample. A substantially pure protein comprises at least about 75% by
weight of the total protein, with at least about 80% being preferred, and at least about 90%
being particularly preferred. The definition includes the production of an angiogenesis protein
from one organism in a different organism or host cell. Alternatively, the protein may be
made at a significantly higher concentration than is normally seen, through the use of an
inducible promoter or high expression promoter, such that the protein is made at increased
concentration levels. Alternatively, the protein may be in a form not normally found in
nature, as in the addition of an epitope tag or amino acid substitutions, insertions and
deletions, as discussed below.

In a preferred embodiment, the angiogenesis sequences are nucleic acids. As
will be appreciated by those in the art and is more fully outlined below, angiogenesis
sequences are useful in a variety of applications, including diagnostic applications, which will
detect naturally occurring nucleic acids, as well as screening applications; for example,
biochips comprising nucleic acid probes to the angiogenesis sequences can be generated. In
the broadest sense, then, by "nucleic acid" or "oligonucleotide" or grammatical equivalents
herein is meant at least two nucleotides covalently linked together. A nucleic acid of the
present invention will generally contain phosphodiester bonds, although in some cases,
nucleic acid analogs are included that may have alternate backbones, comprising, for
example, phosphoramidate, phosphorothioate, phosphorodithioate, or
O-methylphosphoroamidite linkages (see Eckstein, Oligonucleotides and Analogues: A Practical
Approach, Oxford University Press); and peptide nucleic acid backbones and linkages. Other
analog nucleic acids include those with positive backbones, non-ionic backbones, and
non-ribose backbones, including those described in U.S. Patent Nos. 5,235,033 and 5,034,506,
and Chapters 6 and 7, ACS Symposium Series 580, "Carbohydrate Modifications in
Antisense Research", Ed. Y.S. Sanghvi and P. Dan Cook. Nucleic acids containing one or
more carbocyclic sugars are also included within one definition of nucleic acids.
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, for
example to increase the stability and half-life of such molecules in physiological
environments or as probes on a biochip.

As will be appreciated by those in the art, nucleic acid analogs may find use in
the present invention. In addition, mixtures of naturally occurring nucleic acids and analogs
can be made; alternatively, mixtures of different nucleic acid analogs may be made.

Particularly preferred are peptide nucleic acids (PNA), which include peptide
nucleic acid analogs. These backbones are substantially non-ionic under neutral conditions, in
contrast to the highly charged phosphodiester backbone of naturally occurring nucleic acids.
This results in two advantages. First, the PNA backbone exhibits improved hybridization
kinetics. PNAs have larger changes in the melting temperature (Tm) for mismatched versus
perfectly matched basepairs. DNA and RNA typically exhibit a 2-4°C drop in Tm for an
internal mismatch. With the non-ionic PNA backbone, the drop is closer to 7-9°C. Similarly,
due to their non-ionic nature, hybridization of the bases attached to these backbones is
relatively insensitive to salt concentration. In addition, PNAs are not degraded by cellular
enzymes, and thus can be more stable.

The nucleic acids may be single stranded or double stranded, as specified, or
contain portions of both double stranded or single stranded sequence. As will be appreciated
by those in the art, the depiction of a single strand also defines the sequence of the
complementary strand; thus the sequences described herein also provide the complement of
the sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA or a hybrid,
where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and
combinations of bases, including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine, isoguanine, etc. As used herein, the term "nucleoside"
includes nucleotides and nucleoside and nucleotide analogs, and modified nucleosides such
as amino modified nucleosides. In addition, "nucleoside" includes non-naturally occurring
analog structures. Thus for example the individual units of a peptide nucleic acid, each
containing a base, are referred to herein as a nucleoside.

An angiogenesis sequence can be initially identified by substantial nucleic
acid and/or amino acid sequence homology to the angiogenesis sequences outlined herein.
Such homology can be based upon the overall nucleic acid or amino acid sequence, and is
generally determined as outlined below, using either homology programs or hybridization
conditions.

For identifying angiogenesis-associated sequences, the angiogenesis screen
typically includes comparing genes identified in a modification of an in vitro model of
angiogenesis as described in Hiraoka, Cell 95:365 (1998) with genes identified in controls.
Samples of normal tissue and tissue undergoing angiogenesis are applied to biochips
comprising nucleic acid probes. The samples are first microdissected, if applicable, and
treated as is known in the art for the preparation of mRNA. Suitable biochips are
commercially available, for example from Affymetrix. Gene expression profiles as described
herein are generated and the data analyzed.

In a preferred embodiment, the genes showing changes in expression as
between normal and disease states are compared to genes expressed in other normal tissues,
including, but not limited to lung, heart, brain, liver, breast, kidney, muscle, prostate, small
intestine, large intestine, spleen, bone and placenta. In a preferred embodiment, those genes
identified during the angiogenesis screen that are expressed in any significant amount in other
tissues are removed from the profile, although in some embodiments, this is not necessary.
That is, when screening for drugs, it is usually preferable that the target be disease specific, to
minimise possible side effects.
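
The filtering step described above, in which candidate genes that are expressed in other normal tissues are removed from the profile, can be sketched in Python as follows. The function name, the gene identifiers, the expression values and the cutoff that stands in for "any significant amount" are all hypothetical and chosen only for illustration.

```python
def tissue_specific_candidates(candidates, normal_tissue_expression, threshold):
    """Keep candidates whose expression is below `threshold` in every other normal tissue.

    `normal_tissue_expression` maps gene -> {tissue name -> expression level}.
    Genes with no recorded tissue data are kept.
    """
    kept = []
    for gene in candidates:
        levels = normal_tissue_expression.get(gene, {})
        if all(level < threshold for level in levels.values()):
            kept.append(gene)
    return kept


if __name__ == "__main__":
    # Hypothetical genes and expression levels in arbitrary units.
    expression = {
        "GENE_A": {"heart": 5.0, "liver": 8.0, "kidney": 3.0},
        "GENE_B": {"heart": 250.0, "liver": 12.0, "kidney": 9.0},
        "GENE_C": {"heart": 2.0, "liver": 1.0, "kidney": 400.0},
    }
    print(tissue_specific_candidates(["GENE_A", "GENE_B", "GENE_C"], expression, 50.0))
```

Because the text notes that this removal is not necessary in some embodiments, the filter would be an optional step in any real pipeline.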

In a preferred embodiment, angiogenesis sequences are those that are
up-regulated in angiogenesis disorders; that is, the expression of these genes is higher in the
disease tissue as compared to normal tissue. "Up-regulation" as used herein means at least
about a two-fold change, preferably at least about a three fold change, with at least about
five-fold or higher being preferred. All accession numbers herein are for the GenBank
sequence database and the sequences of the accession numbers are hereby expressly
incorporated by reference. GenBank is known in the art, see, e.g., Benson, DA, et al.,
Nucleic Acids Research 26:1-7 (1998) and http://www.ncbi.nlm.nih.gov/. Sequences are also
available in other databases, e.g., European Molecular Biology Laboratory (EMBL) and
DNA Database of Japan (DDBJ). In addition, most preferred genes were found to be
expressed in a limited amount or not at all in heart, brain, lung, liver, breast, kidney, prostate,
small intestine and spleen.

In another preferred embodiment, angiogenesis sequences are those that are
down-regulated in the angiogenesis disorder; that is, the expression of these genes is lower in
angiogenic tissue as compared to normal tissue. "Down-regulation" as used herein means at
least about a two-fold change, preferably at least about a three fold change, with at least about
five-fold or higher being preferred.
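
The fold-change definitions of up-regulation and down-regulation given above can be sketched in Python as follows. The function name, the "unchanged" label and the example expression levels are assumptions made for illustration; the default cutoff of 2.0 reflects the "at least about a two-fold change" wording, and 3.0 or 5.0 could be passed for the preferred levels.

```python
def classify_regulation(angiogenic_level, normal_level, min_fold=2.0):
    """Classify a gene by the ratio of angiogenic to normal tissue expression."""
    if angiogenic_level <= 0 or normal_level <= 0:
        raise ValueError("expression levels must be positive")
    ratio = angiogenic_level / normal_level
    if ratio >= min_fold:
        return "up-regulated"
    if ratio <= 1.0 / min_fold:
        return "down-regulated"
    return "unchanged"  # illustrative label, not from the text


if __name__ == "__main__":
    # Expression levels are arbitrary illustrative units.
    print(classify_regulation(200.0, 100.0))
    print(classify_regulation(50.0, 100.0))
    print(classify_regulation(120.0, 100.0))
    print(classify_regulation(200.0, 100.0, min_fold=5.0))
```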

Angiogenesis sequences according to the invention may be classified into
discrete clusters of sequences based on common expression profiles of the sequences.
Expression levels of angiogenesis sequences may increase or decrease as a function of time in
a manner that correlates with the induction of angiogenesis. Alternatively, expression levels
of angiogenesis sequences may both increase and decrease as a function of time. For
example, expression levels of some angiogenesis sequences are temporarily induced or
diminished during the switch to the angiogenesis phenotype, followed by a return to baseline
expression levels. Tables 1-8 provide genes, the mRNA expression of which varies as a
function of time in angiogenesis tissue when compared to normal tissue.

In a particularly preferred embodiment, angiogenesis sequences are those that
are induced for a period of time, typically by positive angiogenic factors, followed by a return
to the baseline levels. Sequences that are temporarily induced provide a means to target
angiogenesis tissue, for example neovascularized tumors, at a particular stage of
angiogenesis, while avoiding rapidly growing tissue that requires perpetual vascularization.
Such positive angiogenic factors include aFGF, bFGF, VEGF, angiogenin and the like.

Induced angiogenesis sequences also are further categorized with respect to
the timing of induction. For example, some angiogenesis genes may be induced at an early
time period, such as within 10 minutes of the induction of angiogenesis. Others may be
induced later, such as between 5 and 60 minutes, while yet others may be induced for a time
period of about two hours or more followed by a return to baseline expression levels.

In another preferred embodiment are angiogenesis sequences that are inhibited
or reduced as a function of time followed by a return to "normal" expression levels.
Inhibitors of angiogenesis are examples of molecules that have this expression profile. These
sequences also can be further divided into groups depending on the timing of diminished
expression. For example, some molecules may display reduced expression within 10 minutes
of the induction of angiogenesis. Others may be diminished later, such as between 5 and 60
minutes, while others may be diminished for a time period of about two hours or more
followed by a return to baseline. Examples of such negative angiogenic factors include
thrombospondin and endostatin, to name a few.

In yet another preferred embodiment are angiogenesis sequences that are 
induced for prolonged periods. These sequences are typically associated with induction of 
angiogenesis and may participate in induction and/or maintenance of the angiogenesis
phenotype. 

In another preferred embodiment are angiogenesis sequences, the expression 
of which is reduced or diminished for prolonged periods in angiogenic tissue. These 
sequences are typically angiogenesis inhibitors and their diminution is correlated with an 
increase in angiogenesis.

Informatics

The ability to identify genes that undergo changes in expression with time
during angiogenesis can additionally provide high-resolution, high-sensitivity datasets which
can be used in the areas of diagnostics, therapeutics, drug development, biosensor
development, and other related areas. For example, the expression profiles can be used in
diagnostic or prognostic evaluation of patients with angiogenesis-associated disease. Or as
another example, subcellular toxicological information can be generated to better direct drug
structure and activity correlation (see, Anderson, L., "Pharmaceutical Proteomics: Targets,
Mechanism, and Function," paper presented at the IBC Proteomics conference, Coronado,
CA (June 11-12, 1998)). Subcellular toxicological information can also be utilized in a
biological sensor device to predict the likely toxicological effect of chemical exposures and
likely tolerable exposure thresholds (see, U.S. Patent No. 5,811,231). Similar advantages
accrue from datasets relevant to other biomolecules and bioactive agents (e.g., nucleic acids,
saccharides, lipids, drugs, and the like).

Thus, in another embodiment, the present invention provides a database that 
includes at least one set of assay data. The data contained in the database is acquired,
e.g., using array analysis either singly or in a library format. The database can be in
substantially any form in which data can be maintained and transmitted, but is preferably an
electronic database. The electronic database of the invention can be maintained on any
electronic device allowing for the storage of and access to the database, such as a personal
computer, but is preferably distributed on a wide area network, such as the World Wide Web.

The focus of the present section on databases that include peptide sequence
data is for clarity of illustration only. It will be apparent to those of skill in the art that similar
databases can be assembled for any assay data acquired using an assay of the invention.

The compositions and methods for identifying and/or quantitating the relative 
and/or absolute abundance of a variety of molecular and macromolecular species from a
biological sample undergoing angiogenesis, i.e., the identification of angiogenesis-associated
sequences described herein, provide an abundance of information, which can be correlated
with pathological conditions, predisposition to disease, drug testing, therapeutic monitoring,
gene-disease causal linkages, identification of correlates of immunity and physiological
status, among others. Although the data generated from the assays of the invention is suited
for manual review and analysis, in a preferred embodiment, prior data processing using high-
speed computers is utilized.

An array of methods for indexing and retrieving biomolecular information is 
known in the art. For example, U.S. Patents 6,023,659 and 5,966,712 disclose a relational
database system for storing biomolecular sequence information in a manner that allows
sequences to be catalogued and searched according to one or more protein function
hierarchies. U.S. Patent 5,953,727 discloses a relational database having sequence records
containing information in a format that allows a collection of partial-length DNA sequences
to be catalogued and searched according to association with one or more sequencing projects
for obtaining full-length sequences from the collection of partial length sequences. U.S.
Patent 5,706,498 discloses a gene database retrieval system for making a retrieval of a gene
sequence similar to a sequence data item in a gene database based on the degree of similarity
between a key sequence and a target sequence. U.S. Patent 5,538,897 discloses a method
using mass spectroscopy fragmentation patterns of peptides to identify amino acid sequences
in computer databases by comparison of predicted mass spectra with experimentally-derived
mass spectra using a closeness-of-fit measure. U.S. Patent 5,926,818 discloses a multi-
dimensional database comprising a functionality for multi-dimensional data analysis
described as on-line analytical processing (OLAP), which entails the consolidation of
projected and actual data according to more than one consolidation path or dimension. U.S.
Patent 5,295,261 reports a hybrid database structure in which the fields of each database
record are divided into two classes, navigational and informational data, with navigational
fields stored in a hierarchical topological map which can be viewed as a tree structure or as
the merger of two or more such tree structures.

The present invention provides a computer database comprising a computer
and software for storing in computer-retrievable form assay data records cross-tabulated, e.g.,
with data specifying the source of the target-containing sample from which each sequence
specificity record was obtained.

In an exemplary embodiment, at least one of the sources of target-containing
sample is from a control tissue sample known to be free of pathological disorders. In a
variation, at least one of the sources is a known pathological tissue specimen, e.g., a
neoplastic lesion or another tissue specimen to be analyzed for angiogenesis. In another
variation, the assay records cross-tabulate one or more of the following parameters for each
target species in a sample: (1) a unique identification code, which can include, e.g., a target
molecular structure and/or characteristic separation coordinate (e.g., electrophoretic
coordinates); (2) sample source; and (3) absolute and/or relative quantity of the target species
present in the sample.
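
As an illustration only, the three cross-tabulated parameters listed above could be held in a record structure such as the following minimal Python sketch. The class name `AssayRecord`, the field names, and the function `cross_tabulate` are hypothetical and are not taken from the specification; the identifiers and quantities are made-up placeholder values.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AssayRecord:
    """One assay record (hypothetical layout, not from the specification)."""
    target_id: str      # (1) unique identification code of the target species
    sample_source: str  # (2) sample source, e.g. control or pathological tissue
    quantity: float     # (3) absolute or relative quantity in the sample


def cross_tabulate(records):
    """Return {target_id: {sample_source: quantity}} for the given records."""
    table = defaultdict(dict)
    for rec in records:
        table[rec.target_id][rec.sample_source] = rec.quantity
    return {target: dict(by_source) for target, by_source in table.items()}


# Placeholder data for demonstration only.
records = [
    AssayRecord("T001", "control", 1.0),
    AssayRecord("T001", "lesion", 4.2),
    AssayRecord("T002", "control", 3.0),
    AssayRecord("T002", "lesion", 0.5),
]
table = cross_tabulate(records)
print(table["T001"])
```

Keying the table first by target and then by source makes it straightforward to compare the quantity of one target species across control and pathological specimens.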

The invention also provides for the storage and retrieval of a collection of
target data in a computer data storage apparatus, which can include magnetic disks, optical
disks, magneto-optical disks, DRAM, SRAM, SGRAM, SDRAM, RDRAM, DDR RAM,
magnetic bubble memory devices, and other data storage devices, including CPU registers
and on-CPU data storage arrays. Typically, the target data records are stored as a bit pattern
in an array of magnetic domains on a magnetizable medium or as an array of charge states or
transistor gate states, such as an array of cells in a DRAM device (e.g., each cell comprised of
a transistor and a charge storage area, which may be on the transistor). In one embodiment,
the invention provides such storage devices, and computer systems built therewith,
comprising a bit pattern encoding a protein expression fingerprint record comprising unique
identifiers for at least 10 target data records cross-tabulated with target source.

When the target is a peptide or nucleic acid, the invention preferably provides
a method for identifying related peptide or nucleic acid sequences, comprising performing a
computerized comparison between a peptide or nucleic acid sequence assay record stored in
or retrieved from a computer storage device or database and at least one other sequence. The
comparison can include a sequence analysis or comparison algorithm or computer program
embodiment thereof (e.g., FASTA, TFASTA, GAP, BESTFIT) and/or the comparison may
be of the relative amount of a peptide or nucleic acid sequence in a pool of sequences
determined from a polypeptide or nucleic acid sample of a specimen.

The invention also preferably provides a magnetic disk, such as an IBM-
compatible (DOS, Windows, Windows95/98/2000, Windows NT, OS/2) or other format
(e.g., Linux, SunOS, Solaris, AIX, SCO Unix, VMS, MV, Macintosh, etc.) floppy diskette or
hard (fixed, Winchester) disk drive, comprising a bit pattern encoding data from an assay of
the invention in a file format suitable for retrieval and processing in a computerized sequence
analysis, comparison, or relative quantitation method.

The invention also provides a network, comprising a plurality of computing 
devices linked via a data link, such as an Ethernet cable (coax or 10BaseT), telephone line,
ISDN line, wireless network, optical fiber, or other suitable signal transmission medium,
whereby at least one network device (e.g., computer, disk array, etc.) comprises a pattern of 
magnetic domains (e.g., magnetic disk) and/or charge domains (e.g., an array of DRAM 
cells) composing a bit pattern encoding data acquired from an assay of the invention. 

The invention also provides a method for transmitting assay data that includes 
generating an electronic signal on an electronic communications device, such as a modem, 
ISDN terminal adapter, DSL, cable modem, ATM switch, or the like, wherein the signal 
includes (in native or encrypted format) a bit pattern encoding data from an assay or a 
database comprising a plurality of assay results obtained by the method of the invention. 

In a preferred embodiment, the invention provides a computer system for 
comparing a query target to a database containing an array of data structures, such as an assay 
result obtained by the method of the invention, and ranking database targets based on the 
degree of identity and gap weight to the target data. A central processor is preferably 
initialized to load and execute the computer program for alignment and/or comparison of the 
assay results. Data for a query target is entered into the central processor via an I/O device. 
Execution of the computer program results in the central processor retrieving the assay data 
from the data file, which comprises a binary description of an assay result. 

The target data or record and the computer program can be transferred to 
secondary memory, which is typically random access memory (e.g., DRAM, SRAM,
SGRAM, or SDRAM). Targets are ranked according to the degree of correspondence 
between a selected assay characteristic (e.g., binding to a selected affinity moiety) and the 
same characteristic of the query target and results are output via an I/O device. For example, 
a central processor can be a conventional computer (e.g., Intel Pentium, PowerPC, Alpha, 
PA-8000, SPARC, MIPS 4400, MIPS 10000, VAX, etc.); a program can be a commercial or 
public domain molecular biology software package (e.g., UWGCG Sequence Analysis 
Software, Darwin); a data file can be an optical or magnetic disk, a data server, a memory 
device (e.g., DRAM, SRAM, SGRAM, SDRAM, EPROM, bubble memory, flash memory,
etc.); an I/O device can be a terminal comprising a video display and a keyboard, a modem,
an ISDN terminal adapter, an Ethernet port, a punched card reader, a magnetic strip reader, or
other suitable I/O device.

The invention also preferably provides the use of a computer system, such as 
that described above, which comprises: (1) a computer; (2) a stored bit pattern encoding a
collection of peptide sequence specificity records obtained by the methods of the invention,
which may be stored in the computer; (3) a comparison target, such as a query target; and (4) 
a program for alignment and comparison, typically with rank-ordering of comparison results 
on the basis of computed similarity values. 
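
The rank-ordering of comparison results described above can be sketched as follows. This is a minimal illustration, not the alignment program of the specification: Python's standard-library `difflib.SequenceMatcher` is used here only as a stand-in similarity measure in place of an alignment tool such as FASTA or GAP, and the function name `rank_targets` and the example peptide strings are hypothetical.

```python
from difflib import SequenceMatcher


def rank_targets(query, database):
    """Rank database entries by similarity to the query, best match first.

    `database` maps an entry name to its sequence string. The similarity
    value is SequenceMatcher.ratio(), a generic string-similarity score in
    the range 0..1 (an assumption standing in for an alignment score).
    """
    scored = []
    for name, seq in database.items():
        score = SequenceMatcher(None, query, seq, autojunk=False).ratio()
        scored.append((name, score))
    # Sort by descending score; ties are broken by name for reproducibility.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored


# Placeholder sequences for demonstration only.
database = {
    "pep1": "MKTAYIAKQR",
    "pep2": "MKTAYLAKQR",
    "pep3": "GGGGSGGGGS",
}
for name, score in rank_targets("MKTAYIAKQR", database):
    print(name, round(score, 2))
```

In practice the scoring function would be replaced by the similarity value computed by whichever alignment and comparison program is loaded by the central processor.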

Angiogenesis-associated sequences

Angiogenesis proteins of the present invention may be classified as secreted 
proteins, transmembrane proteins or intracellular proteins. In one embodiment, the
angiogenesis protein is an intracellular protein. Intracellular proteins may be found in the
cytoplasm and/or in the nucleus or associated with the intracellular side of the plasma
membrane. Intracellular proteins are involved in all aspects of cellular function and
replication (including, e.g., signaling pathways); aberrant expression of such proteins often
results in unregulated or dysregulated cellular processes (see, e.g., Molecular Biology of the
Cell, 3rd Edition, Alberts, Ed., Garland Pub., 1994). For example, many intracellular
proteins have enzymatic activity such as protein kinase activity, protein phosphatase activity,
protease activity, nucleotide cyclase activity, polymerase activity and the like. Intracellular
proteins also serve as docking proteins that are involved in organizing complexes of proteins,
or targeting proteins to various subcellular localizations, and are involved in maintaining the
structural integrity of organelles.

An increasingly appreciated concept in characterizing proteins is the presence
in the proteins of one or more motifs for which defined functions have been attributed. In
addition to the highly conserved sequences found in the enzymatic domain of proteins, highly
conserved sequences have been identified in proteins that are involved in protein-protein
interaction. For example, Src-homology-2 (SH2) domains bind tyrosine-phosphorylated
targets in a sequence dependent manner. PTB domains, which are distinct from SH2
domains, also bind tyrosine phosphorylated targets. SH3 domains bind to proline-rich
targets. In addition, PH domains, tetratricopeptide repeats and WD domains, to name only a
few, have been shown to mediate protein-protein interactions. Some of these may also be
involved in binding to phospholipids or other second messengers. As will be appreciated by
one of ordinary skill in the art, these motifs can be identified on the basis of primary

WO 02/079492 PCT/US02/04915 

sequence; thus, an analysis of the sequence of proteins may provide insight into both the 
enzymatic potential of the molecule and/or molecules with which the protein may associate. 

In another embodiment, the angiogenesis sequences are transmembrane 
proteins. Transmembrane proteins are molecules that span a phospholipid bilayer of a cell. 
They may have an intracellular domain, an extracellular domain, or both. The intracellular
domains of such proteins may have a number of functions including those already described
for intracellular proteins. For example, the intracellular domain may have enzymatic activity
and/or may serve as a binding site for additional proteins. Frequently the intracellular
domain of transmembrane proteins serves both roles. For example, certain receptor tyrosine
kinases have both protein kinase activity and SH2 domains. In addition, autophosphorylation
of tyrosines on the receptor molecule itself creates binding sites for additional SH2 domain
containing proteins.

Transmembrane proteins may contain from one to many transmembrane 
domains. For example, receptor tyrosine kinases, certain cytokine receptors, receptor
guanylyl cyclases and receptor serine/threonine protein kinases contain a single
transmembrane domain. However, various other proteins including channels and adenylyl
cyclases contain numerous transmembrane domains. Many important cell surface receptors
such as G protein coupled receptors (GPCRs) are classified as "seven transmembrane
domain" proteins, as they contain 7 membrane spanning regions. Characteristics of
transmembrane domains include approximately 20 consecutive hydrophobic amino acids that
may be followed or flanked by charged amino acids. Therefore, upon analysis of the amino
acid sequence of a particular protein, the localization and number of transmembrane domains
within the protein may be predicted (see, e.g., PSORT web site http://psort.nibb.ac.jp/).
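
The sequence-based prediction described above can be illustrated with a deliberately crude Python sketch that scans for runs of about 20 consecutive hydrophobic residues. This is not the method used by PSORT or any other published predictor; the residue set `HYDROPHOBIC`, the function name `hydrophobic_runs`, and the default run length are assumptions chosen for illustration.

```python
# Assumed set of hydrophobic residues (one-letter codes); other
# classifications of hydrophobicity are equally defensible.
HYDROPHOBIC = set("AVLIMFWC")


def hydrophobic_runs(seq, min_len=20):
    """Return (start, end) index pairs, end exclusive, of every run of at
    least `min_len` consecutive hydrophobic residues in `seq`."""
    runs = []
    start = None
    for i, aa in enumerate(seq.upper()):
        if aa in HYDROPHOBIC:
            if start is None:
                start = i  # a new hydrophobic run begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Handle a run that extends to the end of the sequence.
    if start is not None and len(seq) - start >= min_len:
        runs.append((start, len(seq)))
    return runs


# Artificial example: 22 leucines flanked by charged residues.
example = "DKR" + "L" * 22 + "KRDE"
print(hydrophobic_runs(example))
```

The number of runs returned gives a rough count of candidate membrane-spanning segments, and the indices give their approximate location; real predictors typically use averaged hydropathy scales and flanking-charge rules rather than strict consecutive runs.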

The extracellular domains of transmembrane proteins are diverse; however,
conserved motifs are found repeatedly among various extracellular domains. Conserved
structure and/or functions have been ascribed to different extracellular motifs. Many
extracellular domains are involved in binding to other molecules. In one aspect, extracellular
domains are found on receptors. Factors that bind the receptor domain include circulating
ligands, which may be peptides, proteins, or small molecules such as adenosine and the like.
For example, growth factors such as EGF, FGF and PDGF are circulating growth factors that
bind to their cognate receptors to initiate a variety of cellular responses. Other factors include
cytokines, mitogenic factors, neurotrophic factors and the like. Extracellular domains also
bind to cell-associated molecules. In this respect, they mediate cell-cell interactions. Cell-
associated ligands can be tethered to the cell, for example via a glycosylphosphatidylinositol
(GPI) anchor, or may themselves be transmembrane proteins. Extracellular domains also
associate with the extracellular matrix and contribute to the maintenance of the cell structure.

Angiogenesis proteins that are transmembrane are particularly preferred in the 
present invention as they are readily accessible targets for immunotherapeutics, as are 
described herein. In addition, as outlined below, transmembrane proteins can also be useful
in imaging modalities. Antibodies may be used to label such readily accessible proteins in
situ. Alternatively, antibodies can also label intracellular proteins, in which case samples are
typically permeabilized to provide access to intracellular proteins.

It will also be appreciated by those in the art that a transmembrane protein can
be made soluble by removing transmembrane sequences, for example through recombinant
methods. Furthermore, transmembrane proteins that have been made soluble can be made to 
be secreted through recombinant means by adding an appropriate signal sequence. 

In another embodiment, the angiogenesis proteins are secreted proteins, the
secretion of which can be either constitutive or regulated. These proteins have a signal
peptide or signal sequence that targets the molecule to the secretory pathway. Secreted
proteins are involved in numerous physiological events; by virtue of their circulating nature,
they serve to transmit signals to various other cell types. The secreted protein may function in
an autocrine manner (acting on the cell that secreted the factor), a paracrine manner (acting
on cells in close proximity to the cell that secreted the factor) or an endocrine manner (acting
on cells at a distance). Thus secreted molecules find use in modulating or altering numerous
aspects of physiology. Angiogenesis proteins that are secreted proteins are particularly
preferred in the present invention as they serve as good targets for diagnostic markers, e.g.,
for blood or serum tests.

An angiogenesis sequence is typically initially identified by substantial nucleic
acid and/or amino acid sequence homology or linkage to the angiogenesis sequences outlined
herein. Such homology can be based upon the overall nucleic acid or amino acid sequence, 
and is generally determined as outlined below, using either homology programs or 
hybridization conditions. Typically, linked sequences on a mRNA are found on the same 
molecule. 

As detailed in the definitions, percent identity can be determined using an
algorithm such as BLAST. A preferred method utilizes the BLASTN module of WU-
BLAST-2 set to the default parameters, with overlap span and overlap fraction set to 1 and
0.125, respectively. The alignment may include the introduction of gaps in the sequences to
be aligned. In addition, for sequences which contain either more or fewer nucleotides than
those of the nucleic acids of the figures, it is understood that the percentage of homology will
be determined based on the number of homologous nucleosides in relation to the total number
of nucleosides. Thus, for example, homology of sequences shorter than those of the
sequences identified herein and as discussed below, will be determined using the number of
nucleosides in the shorter sequence.
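
The normalization described above, in which the percentage is computed over the number of nucleosides in the shorter sequence, can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned by some external program and padded with `-` gap characters to equal length; it does not perform the alignment itself and is not the WU-BLAST-2 implementation. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap),
    normalized by the ungapped length of the shorter sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Count aligned columns where both sequences carry the same base.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    shorter = min(
        len(aligned_a.replace("-", "")), len(aligned_b.replace("-", ""))
    )
    if shorter == 0:
        raise ValueError("sequences must contain at least one base")
    return 100.0 * matches / shorter


# A 6-base fragment aligned against a 10-base sequence.
print(percent_identity("ACGTACGTAC", "--GTACGT--"))
```

Normalizing by the shorter sequence means that a fragment fully contained in a longer sequence is not penalized for the unaligned flanking bases of the longer sequence.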

In one embodiment, the nucleic acid homology is determined through 
hybridization studies. Thus, e.g., nucleic acids which hybridize under high stringency to a
nucleic acid of Tables 1-8, or its complement, or are also found on naturally occurring
mRNAs are considered angiogenesis sequences. In another embodiment, less stringent
hybridization conditions are used; for example, moderate or low stringency conditions may
be used, as are known in the art; see Ausubel, supra, and Tijssen, supra.

In addition, the angiogenesis nucleic acid sequences of the invention, e.g., the
sequences in Tables 1-8, are fragments of larger genes, i.e., they are nucleic acid segments.
"Genes" in this context includes coding regions, non-coding regions, and mixtures of coding
and non-coding regions. Accordingly, as will be appreciated by those in the art, using the
sequences provided herein, extended sequences, in either direction, of the angiogenesis genes
can be obtained, using techniques well known in the art for cloning either longer sequences or
the full length sequences; see Ausubel, et al., supra. Much can be done by informatics and
many sequences can be clustered to include multiple sequences, e.g., systems such as
UniGene (see, http://www.ncbi.nlm.nih.gov/UniGene/).

Once the angiogenesis nucleic acid is identified, it can be cloned and, if 
necessary, its constituent parts recombined to form the entire angiogenesis nucleic acid 
coding regions or the entire mRNA sequence. Once isolated from its natural source, e.g., 
contained within a plasmid or other vector or excised therefrom as a linear nucleic acid 
segment, the recombinant angiogenesis nucleic acid can be further used as a probe to identify
and isolate other angiogenesis nucleic acids, for example extended coding regions. It can 
also be used as a "precursor" nucleic acid to make modified or variant angiogenesis nucleic 
acids and proteins. 

The angiogenesis nucleic acids of the present invention are used in several 
ways. In a first embodiment, nucleic acid probes to the angiogenesis nucleic acids are made
and attached to biochips to be used in screening and diagnostic methods, as outlined below,
or for administration, for example for gene therapy, vaccine, and/or antisense applications.
Alternatively, the angiogenesis nucleic acids that include coding regions of angiogenesis
proteins can be put into expression vectors for the expression of angiogenesis proteins, again
for screening purposes or for administration to a patient.

In a preferred embodiment, nucleic acid probes to angiogenesis nucleic acids 
(both the nucleic acid sequences outlined in the figures and/or the complements thereof) are 
made. The nucleic acid probes attached to the biochip are designed to be substantially
complementary to the angiogenesis nucleic acids, i.e., the target sequence (either the target
sequence of the sample or to other probe sequences, for example in sandwich assays), such
that hybridization of the target sequence and the probes of the present invention occurs. As
outlined below, this complementarity need not be perfect; there may be any number of base
pair mismatches which will interfere with hybridization between the target sequence and the
single stranded nucleic acids of the present invention. However, if the number of mutations
is so great that no hybridization can occur under even the least stringent of hybridization
conditions, the sequence is not a complementary target sequence. Thus, by "substantially
complementary" herein is meant that the probes are sufficiently complementary to the target
sequences to hybridize under normal reaction conditions, particularly high stringency
conditions, as outlined herein.
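
As a rough computational proxy for the complementarity discussed above, the number of base-pair mismatches between a probe and a target site of equal length can be counted as in the following Python sketch. Whether a given number of mismatches still permits hybridization depends on the experimental stringency conditions, which this sketch does not model. The function names `reverse_complement` and `count_mismatches` are hypothetical, and only the unambiguous DNA bases A, C, G and T are handled.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_mismatches(probe, target_site):
    """Count positions at which `probe` fails to base-pair with
    `target_site`; both are written 5' to 3' and must be equally long."""
    if len(probe) != len(target_site):
        raise ValueError("probe and target site must have equal length")
    perfect_probe = reverse_complement(target_site)
    return sum(1 for p, q in zip(probe.upper(), perfect_probe) if p != q)


print(reverse_complement("ATGC"))
print(count_mismatches("GCTT", "ATGC"))
```

A mismatch count of zero corresponds to perfect complementarity; a probe with a small nonzero count may still qualify as substantially complementary if it hybridizes under the conditions used.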

A nucleic acid probe is generally single stranded but can be partially single 
and partially double stranded. The strandedness of the probe is dictated by the structure, 
composition, and properties of the target sequence. In general, the nucleic acid probes range
from about 8 to about 100 bases long, with from about 10 to about 80 bases being preferred,
and from about 30 to about 50 bases being particularly preferred. That is, generally whole 
genes are not used. In some embodiments, much longer nucleic acids can be used, up to 
hundreds of bases. 

In a preferred embodiment, more than one probe per sequence is used, with 
either overlapping probes or probes to different sections of the target being used. That is,
two, three, four or more probes, with three being preferred, are used to build in a redundancy
for a particular target. The probes can be overlapping (i.e., have some sequence in common),
or separate. In some cases, PCR primers may be used to amplify signal for higher sensitivity.
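
One way to picture the use of several probes per target is the following Python sketch, which spaces a chosen number of equal-length windows evenly along a target sequence. The even-spacing rule, the function name `tile_probes`, and the default values (three probes of 40 bases) are assumptions for illustration; the windows are stretches of the target itself, and an actual probe would be made complementary to each stretch.

```python
def tile_probes(target, probe_len=40, n_probes=3):
    """Return a list of (start, subsequence) windows of length `probe_len`
    spaced evenly from the first to the last possible start position."""
    if not 8 <= probe_len <= 100:
        raise ValueError("probe_len is expected to be about 8 to 100 bases")
    if n_probes < 1:
        raise ValueError("n_probes must be at least 1")
    if len(target) < probe_len:
        raise ValueError("target is shorter than the probe length")
    last_start = len(target) - probe_len
    if n_probes == 1:
        starts = [0]
    else:
        starts = [round(i * last_start / (n_probes - 1)) for i in range(n_probes)]
    return [(s, target[s:s + probe_len]) for s in starts]


# Artificial 100-base target for demonstration only.
target = "ACGT" * 25
for start, window in tile_probes(target):
    print(start, window)
```

With a 100-base target and three 40-base windows, adjacent windows share some sequence (overlapping probes); with a sufficiently long target the same rule yields separate, non-overlapping windows.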

As will be appreciated by those in the art, nucleic acids can be attached or
immobilized to a solid support in a wide variety of ways. By "immobilized" and grammatical
equivalents herein is meant the association or binding between the nucleic acid probe and the
solid support is sufficient to be stable under the conditions of binding, washing, analysis, and
removal as outlined below. The binding can typically be covalent or non-covalent. By "non-
covalent binding" and grammatical equivalents herein is meant one or more of electrostatic,
hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of
the biotinylated probe to the streptavidin. By "covalent binding" and grammatical
equivalents herein is meant that the two moieties, the solid support and the probe, are
attached by at least one bond, including sigma bonds, pi bonds and coordination bonds.
Covalent bonds can be formed directly between the probe and the solid support or can be
formed by a cross linker or by inclusion of a specific reactive group on either the solid
support or the probe or both molecules. Immobilization may also involve a combination of
covalent and non-covalent interactions.

In general, the probes are attached to the biochip in a wide variety of ways, as

will be appreciated by those in the art. As described herein, the nucleic acids can either be 
synthesized first, with subsequent attachment to the biochip, or can be directly synthesized on 
the biochip. 

The biochip comprises a suitable solid substrate. By "substrate" or "solid
support" or other grammatical equivalents herein is meant a material that can be modified to
contain discrete individual sites appropriate for the attachment or association of the nucleic
acid probes and is amenable to at least one detection method. As will be appreciated by those
in the art, the number of possible substrates is very large, and they include, but are not limited to,
glass and modified or functionalized glass, plastics (including acrylics, polystyrene and
copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-
based materials including silicon and modified silicon, carbon, metals, inorganic glasses,
plastics, etc. In general, the substrates allow optical detection and do not appreciably
fluoresce. A preferred substrate is described in copending application entitled Reusable
Low Fluorescent Plastic Biochip, U.S. Application Serial No. 09/270,214, filed March 15,
1999, herein incorporated by reference in its entirety.

Generally the substrate is planar, although as will be appreciated by those in 
the art, other configurations of substrates may be used as well. For example, the probes may 
be placed on the inside surface of a tube, for flow-through sample analysis to minimize
sample volume. Similarly, the substrate may be flexible, such as a flexible foam, including
closed cell foams made of particular plastics. 

In a preferred embodiment, the surface of the biochip and the probe may be 
derivatized with chemical functional groups for subsequent attachment of the two. Thus, for 
example, the biochip is derivatized with a chemical functional group including, but not
limited to, amino groups, carboxy groups, oxo groups and thiol groups, with amino groups
being particularly preferred. Using these functional groups, the probes can be attached using
functional groups on the probes. For example, nucleic acids containing amino groups can be
attached to surfaces comprising amino groups, for example using linkers as are known in the
art; for example, homo- or hetero-bifunctional linkers as are well known (see 1994 Pierce
Chemical Company catalog, technical section on cross-linkers, pages 155-200, incorporated 
herein by reference). In addition, in some cases, additional linkers, such as alkyl groups 
(including substituted and heteroalkyl groups) may be used. 

In this embodiment, oligonucleotides are synthesized as is known in the art,
and then attached to the surface of the solid support. As will be appreciated by those skilled
in the art, either the 5' or 3' terminus may be attached to the solid support, or attachment may
be via an internal nucleoside.

In another embodiment, the immobilization to the solid support may be very 
strong, yet non-covalent. For example, biotinylated oligonucleotides can be made, which
bind to surfaces covalently coated with streptavidin, resulting in attachment.

Alternatively, the oligonucleotides may be synthesized on the surface, as is 
known in the art. For example, photoactivation techniques utilizing photopolymerization 
compounds and techniques are used. In a preferred embodiment, the nucleic acids can be 
synthesized in situ, using well known photolithographic techniques, such as those described
in WO 95/25116; WO 95/35505; U.S. Patent Nos. 5,700,637 and 5,445,934; and references
cited within, all of which are expressly incorporated by reference; these methods of
attachment form the basis of the Affymetrix GeneChip™ technology.

Often, amplification-based assays are performed to measure the expression 
level of angiogenesis-associated sequences. These assays are typically performed in
conjunction with reverse transcription. In such assays, an angiogenesis-associated nucleic
acid sequence acts as a template in an amplification reaction (e.g., Polymerase Chain
Reaction, or PCR). In a quantitative amplification, the amount of amplification product will
be proportional to the amount of template in the original sample. Comparison to appropriate
controls provides a measure of the amount of angiogenesis-associated RNA. Methods of
quantitative amplification are well known to those of skill in the art. Detailed protocols for
quantitative PCR are provided, e.g., in Innis et al. (1990) PCR Protocols, A Guide to Methods
and Applications, Academic Press, Inc. N.Y.
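
The proportionality between product and template noted above can be illustrated with a textbook idealization of exponential amplification, in which each cycle multiplies the amount of product by (1 + E) for a per-cycle efficiency E. This model, the function names, and the use of threshold-cycle values to compare a sample with a control are assumptions introduced for illustration and are not taken from the cited protocols.

```python
def product_amount(template, cycles, efficiency=1.0):
    """Idealized amount of product after `cycles` rounds of amplification,
    starting from `template` copies, with per-cycle efficiency 0..1."""
    return template * (1.0 + efficiency) ** cycles


def relative_template(ct_sample, ct_control, efficiency=1.0):
    """Estimated ratio of starting template, sample over control, from the
    cycle numbers at which each reaction reaches the same product level.
    A sample that reaches the level earlier started with more template."""
    return (1.0 + efficiency) ** (ct_control - ct_sample)


# Doubling the template doubles the product at any fixed cycle number.
print(product_amount(10, 3), product_amount(20, 3))
# A sample crossing the threshold 3 cycles before the control.
print(relative_template(ct_sample=22, ct_control=25))
```

Under this idealization the product at a fixed cycle number is linear in the starting template, which is why comparison against a control amplified under the same conditions gives a measure of the relative amount of template.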

In some embodiments, a TaqMan based assay is used to measure expression. 
TaqMan based assays use a fluorogenic oligonucleotide probe that contains a 5' fluorescent
dye and a 3' quenching agent. The probe hybridizes to a PCR product, but cannot itself be
extended due to a blocking agent at the 3' end. When the PCR product is amplified in
subsequent cycles, the 5' nuclease activity of the polymerase, e.g., AmpliTaq, results in the
cleavage of the TaqMan probe. This cleavage separates the 5' fluorescent dye and the 3'
quenching agent, thereby resulting in an increase in fluorescence as a function of
amplification (see, for example, literature provided by Perkin-Elmer, e.g.,
www2.perkin-elmer.com).

Other suitable amplification methods include, but are not limited to, ligase 
chain reaction (LCR) (see, Wu and Wallace (1989) Genomics 4: 560, Landegren et al. (1988)
Science 241: 1077, and Barringer et al. (1990) Gene 89: 117), transcription amplification
(Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173), self-sustained sequence replication
(Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87: 1874), dot PCR, and linker adapter PCR,
etc.

In a preferred embodiment, angiogenesis nucleic acids, e.g., encoding 

15 angiogenesis proteins are used to make a variety of expression vectors to express 

angiogenesis proteins which can then be used in screening assays, as described below. 
Expression vectors and recombinant DNA technology are well known to those of skill in the 
art (see, e.g., Ausubel, supra, and Gene Expression Systems, Fernandez & Hoeffler, Eds, 
Academic Press, 1999) and are used to express proteins. The expression vectors may be 

20 either self-replicating extrachromosomal vectors or vectors which integrate into a host 
genome. Generally, these expression vectors include transcriptional and translational 
regulatory nucleic acid operably linked to the nucleic acid encoding the angiogenesis protein. 
The term "control sequences" refers to DNA sequences used for the expression of an 
operably linked coding sequence in a particular host organism. Control sequences that are
suitable for prokaryotes, for example, include a promoter, optionally an operator sequence,
and a ribosome binding site. Eukaryotic cells are known to utilize promoters,
polyadenylation signals, and enhancers. 

Nucleic acid is "operably linked" when it is placed into a functional 
relationship with another nucleic acid sequence. For example, DNA for a presequence or
secretory leader is operably linked to DNA for a polypeptide if it is expressed as a preprotein
that participates in the secretion of the polypeptide; a promoter or enhancer is operably linked 
to a coding sequence if it affects the transcription of the sequence; or a ribosome binding site 
is operably linked to a coding sequence if it is positioned so as to facilitate translation. 
Generally, "operably linked" means that the DNA sequences being linked are contiguous,
and, in the case of a secretory leader, contiguous and in reading phase. However, enhancers
do not have to be contiguous. Linking is typically accomplished by ligation at convenient
restriction sites. If such sites do not exist, synthetic oligonucleotide adaptors or linkers are
used in accordance with conventional practice. Transcriptional and translational regulatory
nucleic acid will generally be appropriate to the host cell used to express the angiogenesis
protein; for example, transcriptional and translational regulatory nucleic acid sequences from 
Bacillus are preferably used to express the angiogenesis protein in Bacillus. Numerous types 
of appropriate expression vectors, and suitable regulatory sequences are known in the art for 
a variety of host cells. 

In general, transcriptional and translational regulatory sequences may include,
but are not limited to, promoter sequences, ribosomal binding sites, transcriptional start and
stop sequences, translational start and stop sequences, and enhancer or activator sequences. 
In a preferred embodiment, the regulatory sequences include a promoter and transcriptional 
start and stop sequences. 

Promoter sequences encode either constitutive or inducible promoters. The
promoters may be either naturally occurring promoters or hybrid promoters. Hybrid
promoters, which combine elements of more than one promoter, are also known in the art, 
and are useful in the present invention. 

In addition, an expression vector may comprise additional elements. For
example, the expression vector may have two replication systems, thus allowing it to be
maintained in two organisms, for example in mammalian or insect cells for expression and in
a prokaryotic host for cloning and amplification. Furthermore, for integrating expression
vectors, the expression vector contains at least one sequence homologous to the host cell 
genome, and preferably two homologous sequences which flank the expression construct.
The integrating vector may be directed to a specific locus in the host cell by selecting the
appropriate homologous sequence for inclusion in the vector. Constructs for integrating 
vectors are well known in the art (e.g., Fernandez & Hoeffler, supra). See also Kitamura, et 
al. (1995) PNAS 92:9146-9150. 

In addition, in a preferred embodiment, the expression vector contains a
selectable marker gene to allow the selection of transformed host cells. Selection genes are
well known in the art and will vary with the host cell used.

The angiogenesis proteins of the present invention are produced by culturing a 
host cell transformed with an expression vector containing nucleic acid encoding an 
angiogenesis protein, under the appropriate conditions to induce or cause expression of the
angiogenesis protein. Conditions appropriate for angiogenesis protein expression will vary
with the choice of the expression vector and the host cell, and will be easily ascertained by
one skilled in the art through routine experimentation or optimization. For example, the use
of constitutive promoters in the expression vector will require optimizing the growth and
proliferation of the host cell, while the use of an inducible promoter requires the appropriate
growth conditions for induction. In addition, in some embodiments, the timing of the harvest 
is important. For example, the baculoviral systems used in insect cell expression are lytic 
viruses, and thus harvest time selection can be crucial for product yield. 

Appropriate host cells include yeast, bacteria, archaebacteria, fungi, and insect
and animal cells, including mammalian cells. Of particular interest are Saccharomyces
cerevisiae and other yeasts, E. coli, Bacillus subtilis, Sf9 cells, C129 cells, 293 cells, 
Neurospora, BHK, CHO, COS, HeLa cells, HUVEC (human umbilical vein endothelial 
cells), THP1 cells (a macrophage cell line) and various other human cells and cell lines. 

In a preferred embodiment, the angiogenesis proteins are expressed in
mammalian cells. Mammalian expression systems are also known in the art, and include
retroviral and adenoviral systems. Of particular use as mammalian promoters are the 
promoters from mammalian viral genes, since the viral genes are often highly expressed and 
have a broad host range. Examples include the SV40 early promoter, mouse mammary tumor 
virus LTR promoter, adenovirus major late promoter, herpes simplex virus promoter, and the
CMV promoter (see, e.g., Fernandez & Hoeffler, supra). Typically, transcription termination
and polyadenylation sequences recognized by mammalian cells are regulatory regions located
3' to the translation stop codon and thus, together with the promoter elements, flank the
coding sequence. Examples of transcription terminator and polyadenylation signals include
those derived from SV40.

The methods of introducing exogenous nucleic acid into mammalian hosts, as
well as other hosts, are well known in the art, and will vary with the host cell used.
Techniques include dextran-mediated transfection, calcium phosphate precipitation,
polybrene mediated transfection, protoplast fusion, electroporation, viral infection,
encapsulation of the polynucleotide(s) in liposomes, and direct microinjection of the DNA
into nuclei.

In a preferred embodiment, angiogenesis proteins are expressed in bacterial 
systems. Bacterial expression systems are well known in the art. Promoters from 
bacteriophage may also be used and are known in the art. In addition, synthetic promoters
and hybrid promoters are also useful; for example, the tac promoter is a hybrid of the trp and
lac promoter sequences. Furthermore, a bacterial promoter can include naturally occurring
promoters of non-bacterial origin that have the ability to bind bacterial RNA polymerase and
initiate transcription. In addition to a functioning promoter sequence, an efficient ribosome
binding site is desirable. The expression vector may also include a signal peptide sequence
that provides for secretion of the angiogenesis protein in bacteria. The protein is either
secreted into the growth media (gram-positive bacteria) or into the periplasmic space, located
between the inner and outer membrane of the cell (gram-negative bacteria). The bacterial
expression vector may also include a selectable marker gene to allow for the selection of
bacterial strains that have been transformed. Suitable selection genes include genes which
render the bacteria resistant to drugs such as ampicillin, chloramphenicol, erythromycin,
kanamycin, neomycin and tetracycline. Selectable markers also include biosynthetic genes,
such as those in the histidine, tryptophan and leucine biosynthetic pathways. These
components are assembled into expression vectors. Expression vectors for bacteria are well
known in the art, and include vectors for Bacillus subtilis, E. coli, Streptococcus cremoris,
and Streptococcus lividans, among others (e.g., Fernandez & Hoeffler, supra). The bacterial
expression vectors are transformed into bacterial host cells using techniques well known in 
the art, such as calcium chloride treatment, electroporation, and others. 

In one embodiment, angiogenesis proteins are produced in insect cells.
Expression vectors for the transformation of insect cells, and in particular, baculovirus-based
expression vectors, are well known in the art. 

In a preferred embodiment, angiogenesis protein is produced in yeast cells. 
Yeast expression systems are well known in the art, and include expression vectors for 
Saccharomyces cerevisiae, Candida albicans and C. maltosa, Hansenula polymorpha,
Kluyveromyces fragilis and K. lactis, Pichia guilliermondii and P. pastoris,
Schizosaccharomyces pombe, and Yarrowia lipolytica.

The angiogenesis protein may also be made as a fusion protein, using 
techniques well known in the art. Thus, for example, for the creation of monoclonal 
antibodies, if the desired epitope is small, the angiogenesis protein may be fused to a carrier
protein to form an immunogen. Alternatively, the angiogenesis protein may be made as a
fusion protein to increase expression, or for other reasons. For example, when the 
angiogenesis protein is an angiogenesis peptide, the nucleic acid encoding the peptide may be 
linked to another nucleic acid for expression purposes. Fusion with detection epitope tags 
can be made, e.g., with FLAG, His6, myc, HA, etc.

In one embodiment, the angiogenesis nucleic acids, proteins and antibodies of
the invention are labeled. By "labeled" herein is meant that a compound has at least one
element, isotope or chemical compound attached to enable the detection of the compound. In
general, labels fall into three classes: a) isotopic labels, which may be radioactive or heavy
isotopes; b) immune labels, which may be antibodies, antigens, or epitope tags; and c) colored
or fluorescent dyes. The labels may be incorporated into the angiogenesis nucleic acids,
proteins and antibodies at any position. For example, the label should be capable of
producing, either directly or indirectly, a detectable signal. The detectable moiety may be a
radioisotope, such as 3H, 14C, 32P, 35S, or 125I, a fluorescent or chemiluminescent compound,
such as fluorescein isothiocyanate, rhodamine, or luciferin, or an enzyme, such as alkaline
phosphatase, beta-galactosidase or horseradish peroxidase. Any method known in the art for
conjugating the antibody to the label may be employed, including those methods described by
Hunter et al., Nature, 144:945 (1962); David et al., Biochemistry, 13:1014 (1974); Pain et
al., J. Immunol. Meth., 40:219 (1981); and Nygren, J. Histochem. and Cytochem., 30:407
(1982).

Accordingly, the present invention also provides angiogenesis protein 
sequences. An angiogenesis protein of the present invention may be identified in several 
ways. "Protein" in this sense includes proteins, polypeptides, and peptides. As will be 
appreciated by those in the art, the nucleic acid sequences of the invention can be used to
generate protein sequences. There are a variety of ways to do this, including cloning the
entire gene and verifying its frame and amino acid sequence, or by comparing it to known 
sequences to search for homology to provide a frame, assuming the angiogenesis protein has 
an identifiable motif or homology to some protein in the database being used. Generally, the 
nucleic acid sequences are input into a program that will search all three frames for
homology. This is done in a preferred embodiment using the following NCBI Advanced
BLAST parameters. The program is blastx or blastn. The database is nr. The input data is as
"Sequence in FASTA format". The organism list is "none". The "expect" is 10; the filter is
default. The "descriptions" is 500, the "alignments" is 500, and the "alignment view" is
pairwise. The "Query Genetic Codes" is standard (1). The matrix is BLOSUM62; gap
existence cost is 11, per residue gap cost is 1; and the lambda ratio is .85 default. This
results in the generation of a putative protein sequence.
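
The frame search described above depends on translating the nucleotide sequence in each reading frame before comparison. The following is a minimal sketch of that translation step for the three forward frames, using the standard genetic code. The function name `translate_three_frames` and the example sequence are illustrative assumptions; the sketch does not reproduce the BLAST search itself or handle the reverse strand.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_three_frames(dna):
    """Translate the forward strand in reading frames 0, 1 and 2."""
    dna = dna.upper().replace("U", "T")
    frames = []
    for offset in range(3):
        codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
        # Codons with ambiguous bases (e.g. N) are reported as "X".
        frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames


# Hypothetical 10-base input; "*" marks a stop codon.
frames = translate_three_frames("ATGGCCTAAG")
```

The frame whose translation shows homology to a known protein would then be taken as the putative reading frame.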

WO 02/079492 PCT/US02/04915

Also included within one embodiment of angiogenesis proteins are amino acid
variants of the naturally occurring sequences, as determined herein. The variants
are preferably greater than about 75% homologous to the wild-type sequence, more
preferably greater than about 80%, even more preferably greater than about 85% and most
preferably greater than 90%. In some embodiments the homology will be as high as about 93
to 95 or 98%. As for nucleic acids, homology in this context means sequence similarity or
identity, with identity being preferred. This homology will be determined using standard
techniques well known in the art as are outlined above for the nucleic acid homologies.
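
As an illustration of how the identity thresholds above might be applied, the following sketch computes percent identity over a pre-computed pairwise alignment and maps it to the stated preference levels. The function names, the convention of skipping gapped columns, and the example sequences are assumptions made for illustration; other identity definitions are in common use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned, ungapped columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are not counted
        compared += 1
        matches += x == y
    return 100.0 * matches / compared if compared else 0.0


def preference_tier(pct):
    """Map percent identity to the preference levels stated in the text."""
    tiers = (
        (90, "most preferred"),
        (85, "even more preferred"),
        (80, "more preferred"),
        (75, "preferred"),
    )
    for cutoff, label in tiers:
        if pct > cutoff:
            return label
    return "below threshold"


# Hypothetical variant differing from the wild type at one of ten residues.
pct = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
tier = preference_tier(pct)
```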

Angiogenesis proteins of the present invention may be shorter or longer than 
the wild type amino acid sequences. Thus, in a preferred embodiment, included within the
definition of angiogenesis proteins are portions or fragments of the wild type sequences
herein. In addition, as outlined above, the angiogenesis nucleic acids of the invention may be
used to obtain additional coding regions, and thus additional protein sequence, using
techniques known in the art. 

In a preferred embodiment, the angiogenesis proteins are derivative or variant 
angiogenesis proteins as compared to the wild-type sequence. That is, as outlined more fully
below, the derivative angiogenesis peptide will often contain at least one amino acid
substitution, deletion or insertion, with amino acid substitutions being particularly preferred.
The amino acid substitution, insertion or deletion may occur at any residue within the 
angiogenesis peptide. 

Also included within one embodiment of angiogenesis proteins of the present 
invention are amino acid sequence variants. These variants typically fall into one or more of
three classes: substitutional, insertional or deletional variants. These variants ordinarily are
prepared by site specific mutagenesis of nucleotides in the DNA encoding the angiogenesis 
protein, using cassette or PCR mutagenesis or other techniques well known in the art, to 
produce DNA encoding the variant, and thereafter expressing the DNA in recombinant cell 
culture as outlined above. However, variant angiogenesis protein fragments having up to
about 100-150 residues may be prepared by in vitro synthesis using established techniques.
Amino acid sequence variants are characterized by the predetermined nature of the variation, 
a feature that sets them apart from naturally occurring allelic or interspecies variation of the 
angiogenesis protein amino acid sequence. The variants typically exhibit the same qualitative 
biological activity as the naturally occurring analogue, although variants can also be selected
which have modified characteristics as will be more fully outlined below.

While the site or region for introducing an amino acid sequence variation is 
predetermined, the mutation per se need not be predetermined. For example, in order to 
optimize the performance of a mutation at a given site, random mutagenesis may be 
conducted at the target codon or region and the expressed angiogenesis variants screened for
the optimal combination of desired activity. Techniques for making substitution mutations at
predetermined sites in DNA having a known sequence are well known, for example, M13 
primer mutagenesis and PCR mutagenesis. Screening of the mutants is done using assays of 
angiogenesis protein activities. 
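
To illustrate what randomizing a single predetermined codon can yield, the following sketch enumerates every non-stop amino acid change reachable at one codon position, together with the codons encoding it. The function name `codon_variants` and the example coding sequence are hypothetical; the sketch models only the sequence space, not the mutagenesis chemistry or the screening assay.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_variants(dna, codon_index):
    """Return the wild-type residue and all substitutions at one codon.

    The result maps each replacement amino acid to the codons encoding it;
    stop codons and codons that keep the wild-type residue are excluded.
    """
    start = 3 * codon_index
    wild_type = CODON_TABLE[dna[start:start + 3].upper()]
    variants = {}
    for codon in ("".join(p) for p in product(BASES, repeat=3)):
        amino_acid = CODON_TABLE[codon]
        if amino_acid not in ("*", wild_type):
            variants.setdefault(amino_acid, []).append(codon)
    return wild_type, variants


# Hypothetical coding sequence Met-Ser-Gly; randomize the second codon.
wild_type, variants = codon_variants("ATGTCTGGA", 1)
```

Each entry in `variants` corresponds to one candidate mutant that would then be expressed and screened.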

Amino acid substitutions are typically of single residues; insertions usually
will be on the order of from about 1 to 20 amino acids, although considerably larger
insertions may be tolerated. Deletions range from about 1 to about 20 residues, although in 
some cases deletions may be much larger. 

Substitutions, deletions, insertions or any combination thereof may be used to
arrive at a final derivative. Generally these changes are done on a few amino acids to
minimize the alteration of the molecule. However, larger changes may be tolerated in certain
circumstances. When small alterations in the characteristics of the angiogenesis protein are 
desired, substitutions are generally made in accordance with the amino acid substitution chart 
provided in the definition section. 

Substantial changes in function or immunological identity are made by
selecting substitutions that are less conservative than those provided in the definition of
"conservative substitution". For example, substitutions may be made which more
significantly affect: the structure of the polypeptide backbone in the area of the alteration, for
example the alpha-helical or beta-sheet structure; the charge or hydrophobicity of the
molecule at the target site; or the bulk of the side chain. The substitutions which in general
are expected to produce the greatest changes in the polypeptide's properties are those in
which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic
residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is
substituted for (or by) any other residue; (c) a residue having an electropositive side chain,
e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g.,
glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is
substituted for (or by) one not having a side chain, e.g., glycine.
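
The four classes (a) through (d) above can be expressed as a simple lookup. The following sketch flags which classes a given single-residue substitution falls into. The residue sets contain only the examples named in the text (in one-letter code), and the function name `disruptive_classes` is a hypothetical choice; a fuller treatment would need complete residue classifications.

```python
# Residue sets limited to the examples given in the text (one-letter codes).
HYDROPHILIC = {"S", "T"}
HYDROPHOBIC = {"L", "I", "F", "V", "A"}
CYS_PRO = {"C", "P"}
POSITIVE = {"K", "R", "H"}
NEGATIVE = {"E", "D"}
BULKY = {"F"}
NO_SIDE_CHAIN = {"G"}


def _swaps(a, b, set1, set2):
    """True if one residue is in set1 and the other in set2 (either order)."""
    return (a in set1 and b in set2) or (a in set2 and b in set1)


def disruptive_classes(original, replacement):
    """Return the classes (a)-(d) that a substitution belongs to, if any."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return []
    classes = []
    if _swaps(a, b, HYDROPHILIC, HYDROPHOBIC):
        classes.append("a")
    if a in CYS_PRO or b in CYS_PRO:
        classes.append("b")
    if _swaps(a, b, POSITIVE, NEGATIVE):
        classes.append("c")
    if _swaps(a, b, BULKY, NO_SIDE_CHAIN):
        classes.append("d")
    return classes


examples = {pair: disruptive_classes(*pair) for pair in ("SL", "PA", "KE", "FG", "LI")}
```

An empty result means only that the substitution matches none of the listed examples, not that it is conservative.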

The variants typically exhibit the same qualitative biological activity and will 
elicit the same immune response as the naturally-occurring analog, although variants also are
selected to modify the characteristics of the angiogenesis proteins as needed. Alternatively,
the variant may be designed such that the biological activity of the angiogenesis protein is 
altered. For example, glycosylation sites may be altered or removed. 

Covalent modifications of angiogenesis polypeptides are included within the 
scope of this invention. One type of covalent modification includes reacting targeted amino
acid residues of an angiogenesis polypeptide with an organic derivatizing agent that is
capable of reacting with selected side chains or the N- or C-terminal residues of an
angiogenesis polypeptide. Derivatization with bifunctional agents is useful, for instance, for
crosslinking angiogenesis polypeptides to a water-insoluble support matrix or surface for use
in the method for purifying anti-angiogenesis polypeptide antibodies or screening assays, as
is more fully described below. Commonly used crosslinking agents include, e.g.,
1,1-bis(diazoacetyl)-2-phenylethane, glutaraldehyde, N-hydroxysuccinimide esters, for example,
esters with 4-azidosalicylic acid, homobifunctional imidoesters, including disuccinimidyl
esters such as 3,3'-dithiobis(succinimidylpropionate), bifunctional maleimides such as
bis-N-maleimido-1,8-octane and agents such as methyl-3-[(p-azidophenyl)dithio]propioimidate.

Other modifications include deamidation of glutaminyl and asparaginyl 
residues to the corresponding glutamyl and aspartyl residues, respectively, hydroxylation of 
proline and lysine, phosphorylation of hydroxyl groups of seryl, threonyl or tyrosyl residues, 
methylation of the α-amino groups of lysine, arginine, and histidine side chains [T.E.
Creighton, Proteins: Structure and Molecular Properties, W.H. Freeman & Co., San
Francisco, pp. 79-86 (1983)], acetylation of the N-terminal amine, and amidation of any
C-terminal carboxyl group.

Another type of covalent modification of the angiogenesis polypeptide 
included within the scope of this invention comprises altering the native glycosylation pattern
of the polypeptide. "Altering the native glycosylation pattern" is intended for purposes herein
to mean deleting one or more carbohydrate moieties found in native sequence angiogenesis
polypeptide, and/or adding one or more glycosylation sites that are not present in the native
sequence angiogenesis polypeptide. Glycosylation patterns can be altered in many ways. For
example, the use of different cell types to express angiogenesis-associated sequences can
result in different glycosylation patterns.

Addition of glycosylation sites to angiogenesis polypeptides may also be 
accomplished by altering the amino acid sequence thereof. The alteration may be made, for 
example, by the addition of, or substitution by, one or more serine or threonine residues to the 
native sequence angiogenesis polypeptide (for O-linked glycosylation sites). The
angiogenesis amino acid sequence may optionally be altered through changes at the DNA
level, particularly by mutating the DNA encoding the angiogenesis polypeptide at preselected 
bases such that codons are generated that will translate into the desired amino acids.

Another means of increasing the number of carbohydrate moieties on the
angiogenesis polypeptide is by chemical or enzymatic coupling of glycosides to the
polypeptide. Such methods are described in the art, e.g., in WO 87/05330 published 11
September 1987, and in Aplin and Wriston, CRC Crit. Rev. Biochem., pp. 259-306 (1981).

Removal of carbohydrate moieties present on the angiogenesis polypeptide
may be accomplished chemically or enzymatically or by mutational substitution of codons
encoding for amino acid residues that serve as targets for glycosylation. Chemical
deglycosylation techniques are known in the art and described, for instance, by Hakimuddin,
et al., Arch. Biochem. Biophys., 259:52 (1987) and by Edge et al., Anal. Biochem., 118:131
(1981). Enzymatic cleavage of carbohydrate moieties on polypeptides can be achieved by the
use of a variety of endo- and exo-glycosidases as described by Thotakura et al., Meth.
Enzymol., 138:350 (1987).

Another type of covalent modification of angiogenesis polypeptides comprises linking the
angiogenesis polypeptide to one of a variety of nonproteinaceous polymers, e.g.,
polyethylene glycol, polypropylene glycol, or polyoxyalkylenes, in the manner set forth in
U.S. Patent Nos. 4,640,835; 4,496,689; 4,301,144; 4,670,417; 4,791,192 or 4,179,337. 

Angiogenesis polypeptides of the present invention may also be modified in a 
way to form chimeric molecules comprising an angiogenesis polypeptide fused to another, 
heterologous polypeptide or amino acid sequence. In one embodiment, such a chimeric
molecule comprises a fusion of an angiogenesis polypeptide with a tag polypeptide which
provides an epitope to which an anti-tag antibody can selectively bind. The epitope tag is
generally placed at the amino- or carboxyl-terminus of the angiogenesis polypeptide. The
presence of such epitope-tagged forms of an angiogenesis polypeptide can be detected using
an antibody against the tag polypeptide. Also, provision of the epitope tag enables the
angiogenesis polypeptide to be readily purified by affinity purification using an anti-tag
antibody or another type of affinity matrix that binds to the epitope tag. In an alternative 
embodiment, the chimeric molecule may comprise a fusion of an angiogenesis polypeptide 
with an immunoglobulin or a particular region of an immunoglobulin. For a bivalent form of 
the chimeric molecule, such a fusion could be to the Fc region of an IgG molecule. 

Various tag polypeptides and their respective antibodies are well known in the
art. Examples include poly-histidine (poly-his) or poly-histidine-glycine (poly-his-gly) tags;
HIS6 and metal chelation tags; the flu HA tag polypeptide and its antibody 12CA5 [Field et
al., Mol. Cell. Biol., 8:2159-2165 (1988)]; the c-myc tag and the 8F9, 3C7, 6E10, G4, B7 and
9E10 antibodies thereto [Evan et al., Molecular and Cellular Biology, 5:3610-3616 (1985)];
and the Herpes Simplex virus glycoprotein D (gD) tag and its antibody [Paborsky et al.,
Protein Engineering, 3(6):547-553 (1990)]. Other tag polypeptides include the Flag-peptide
[Hopp et al., BioTechnology, 6:1204-1210 (1988)]; the KT3 epitope peptide [Martin et al.,
Science, 255:192-194 (1992)]; tubulin epitope peptide [Skinner et al., J. Biol. Chem.,
266:15163-15166 (1991)]; and the T7 gene 10 protein peptide tag [Lutz-Freyermuth et al.,
Proc. Natl. Acad. Sci. USA, 87:6393-6397 (1990)].

Also included with an embodiment of angiogenesis protein are other 
angiogenesis proteins of the angiogenesis family, and angiogenesis proteins from other 
organisms, which are cloned and expressed as outlined below. Thus, probe or degenerate
polymerase chain reaction (PCR) primer sequences may be used to find other related
angiogenesis proteins from humans or other organisms. As will be appreciated by those in
the art, particularly useful probe and/or PCR primer sequences include the unique areas of the 
angiogenesis nucleic acid sequence. As is generally known in the art, preferred PCR primers 
are from about 15 to about 35 nucleotides in length, with from about 20 to about 30 being
preferred, and may contain inosine as needed. The conditions for the PCR reaction are well
known in the art (e.g., Innis, PCR Protocols, supra). 
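
The primer length ranges and the notion of a degenerate primer can be illustrated as follows. The sketch classifies a primer by the length ranges given above and expands a degenerate primer, written with standard IUPAC ambiguity codes, into the concrete sequences it represents. The function names and the example primer are hypothetical, the "about" in the stated ranges is treated as exact for simplicity, and inosine ("I") is passed through unchanged as a single position.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_length_class(primer):
    """Classify primer length against the ranges stated in the text."""
    n = len(primer)
    if 20 <= n <= 30:
        return "preferred"
    if 15 <= n <= 35:
        return "acceptable"
    return "outside range"


def expand_degenerate(primer):
    """List every concrete sequence represented by a degenerate primer."""
    # Symbols outside the IUPAC table (such as inosine, "I") are kept as-is.
    choices = [IUPAC.get(base, base) for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical degenerate primer: N has 4 options and Y has 2.
expanded = expand_degenerate("ATGGCNTAY")
```

The number of expanded sequences gives the degeneracy of the primer pool, which is one practical reason inosine may be substituted at highly ambiguous positions.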

In addition, as is outlined herein, angiogenesis proteins can be made that are 
longer than those encoded by the nucleic acids of the figures, e.g., by the elucidation of 
extended sequences, the addition of epitope or purification tags, the addition of other fusion
sequences, etc.

Angiogenesis proteins may also be identified as being encoded by 
angiogenesis nucleic acids. Thus, angiogenesis proteins are encoded by nucleic acids that 
will hybridize to the sequences of the sequence listings, or their complements, as outlined 
herein. 

In a preferred embodiment, when the angiogenesis protein is to be used to
generate antibodies, e.g., for immunotherapy or immunodiagnosis, the angiogenesis protein
should share at least one epitope or determinant with the full-length protein. By "epitope" or
"determinant" herein is typically meant a portion of a protein which will generate and/or bind
an antibody or T-cell receptor in the context of MHC. Thus, in most instances, antibodies
made to a smaller angiogenesis protein will be able to bind to the full-length protein,
particularly linear epitopes. In a preferred embodiment, the epitope is unique; that is,
antibodies generated to a unique epitope show little or no cross-reactivity. In a preferred
embodiment, the epitope is selected from a protein sequence set out in Table 8.

Methods of preparing polyclonal antibodies are known to the skilled artisan
(e.g., Coligan, supra; and Harlow & Lane, supra). Polyclonal antibodies can be raised in a
mammal, e.g., by one or more injections of an immunizing agent and, if desired, an adjuvant.
Typically, the immunizing agent and/or adjuvant will be injected in the mammal by multiple
subcutaneous or intraperitoneal injections. The immunizing agent may include a protein
encoded by a nucleic acid of the figures or fragment thereof or a fusion protein thereof. It
may be useful to conjugate the immunizing agent to a protein known to be immunogenic in
the mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and soybean
trypsin inhibitor. Examples of adjuvants which may be employed include Freund's complete
adjuvant and MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose
dicorynomycolate). The immunization protocol may be selected by one skilled in the art
without undue experimentation. 

The antibodies may, alternatively, be monoclonal antibodies. Monoclonal
antibodies may be prepared using hybridoma methods, such as those described by Kohler and
Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, hamster, or other
appropriate host animal, is typically immunized with an immunizing agent to elicit
lymphocytes that produce or are capable of producing antibodies that will specifically bind to
the immunizing agent. Alternatively, the lymphocytes may be immunized in vitro. The
immunizing agent will typically include a polypeptide encoded by a nucleic acid of Tables
1-8, or fragment thereof, or a fusion protein thereof. Generally, either peripheral blood
lymphocytes ("PBLs") are used if cells of human origin are desired, or spleen cells or lymph
node cells are used if non-human mammalian sources are desired. The lymphocytes are then
fused with an immortalized cell line using a suitable fusing agent, such as polyethylene
glycol, to form a hybridoma cell [Goding, Monoclonal Antibodies: Principles and Practice,
Academic Press, (1986) pp. 59-103]. Immortalized cell lines are usually transformed
mammalian cells, particularly myeloma cells of rodent, bovine and human origin. Usually,
rat or mouse myeloma cell lines are employed. The hybridoma cells may be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells lack
the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture 
medium for the hybridomas typically will include hypoxanthine, aminopterin, and thymidine
("HAT medium"), which substances prevent the growth of HGPRT-deficient cells.

In one embodiment, the antibodies are bispecific antibodies. Bispecific
antibodies are monoclonal, preferably human or humanized, antibodies that have binding
specificities for at least two different antigens or that have binding specificities for two
epitopes on the same antigen. In one embodiment, one of the binding specificities is for a
protein encoded by a nucleic acid of Tables 1-8 or a fragment thereof; the other one is for any
other antigen, and preferably for a cell-surface protein or receptor or receptor subunit, 
preferably one that is tumor specific. Alternatively, tetramer-type technology may create 
multivalent reagents. 

In a preferred embodiment, the antibodies to angiogenesis protein are capable
of reducing or eliminating a biological function of an angiogenesis protein, as is described
below. That is, the addition of anti-angiogenesis protein antibodies (either polyclonal or 
preferably monoclonal) to angiogenic tissue (or cells containing angiogenesis) may reduce or 
eliminate the angiogenesis activity. Generally, at least a 25% decrease in activity is 
preferred, with at least about 50% being particularly preferred and about a 95-100% decrease
being especially preferred.
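
The activity thresholds above can be applied to assay readings as in the following sketch, which computes the percent decrease relative to an untreated baseline and maps it to the stated preference levels. The function names and the example activity values are illustrative assumptions, and the "about" qualifiers in the text are treated as exact cutoffs for simplicity.

```python
def percent_decrease(baseline, treated):
    """Percent decrease in activity relative to an untreated baseline."""
    if baseline <= 0:
        raise ValueError("baseline activity must be positive")
    return 100.0 * (baseline - treated) / baseline


def inhibition_tier(decrease):
    """Map a percent decrease to the preference levels stated in the text."""
    if decrease >= 95:
        return "especially preferred"
    if decrease >= 50:
        return "particularly preferred"
    if decrease >= 25:
        return "preferred"
    return "below preferred range"


# Hypothetical assay readings in arbitrary activity units.
decrease = percent_decrease(baseline=200.0, treated=90.0)
tier = inhibition_tier(decrease)
```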

In a preferred embodiment, the antibodies to the angiogenesis proteins are
humanized antibodies (e.g., Xenerex Biosciences, Medarex, Inc., Abgenix, Inc., Protein
Design Labs, Inc.). Humanized forms of non-human (e.g., murine) antibodies are chimeric
molecules of immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv,
Fab, Fab', F(ab')2 or other antigen-binding subsequences of antibodies) which contain
minimal sequence derived from non-human immunoglobulin. Humanized antibodies include
human immunoglobulins (recipient antibody) in which residues from a complementarity
determining region (CDR) of the recipient are replaced by residues from a CDR of a
non-human species (donor antibody) such as mouse, rat or rabbit having the desired specificity,
affinity and capacity. In some instances, Fv framework residues of the human
immunoglobulin are replaced by corresponding non-human residues. Humanized antibodies
may also comprise residues which are found neither in the recipient antibody nor in the 
imported CDR or framework sequences. In general, a humanized antibody will comprise 
substantially all of at least one, and typically two, variable domains, in which all or 

30 substantially all of the CDR regions correspond to those of a non-human immunoglobulin 
and all or substantially all of the framework (FR) regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise 
at least a portion of an immunoglobulin constant region (Fc), typically that of a human 
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immunoglobulin [Jones et al., Nature, 321:522-525 (1986); Riechmann et al., Nature, 
332:323-329 (1988); andPresta, Curr. Op. Struct. Biol., 2:593-596 (1992)]. 

Methods for humanizing non-human antibodies are well known in the art.
Generally, a humanized antibody has one or more amino acid residues introduced into it from
a source which is non-human. These non-human amino acid residues are often referred to as
import residues, which are typically taken from an import variable domain. Humanization
can be essentially performed following the method of Winter and co-workers [Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et
al., Science, 239:1534-1536 (1988)], by substituting rodent CDRs or CDR sequences for the
corresponding sequences of a human antibody. Accordingly, such humanized antibodies are
chimeric antibodies (U.S. Patent No. 4,816,567), wherein substantially less than an intact
human variable domain has been substituted by the corresponding sequence from a
non-human species. In practice, humanized antibodies are typically human antibodies in which
some CDR residues and possibly some FR residues are substituted by residues from
analogous sites in rodent antibodies.

Human antibodies can also be produced using various techniques known in the
art, including phage display libraries [Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)]. The techniques of Cole et al. and
Boerner et al. are also available for the preparation of human monoclonal antibodies [Cole et
al., Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, p. 77 (1985) and Boerner et
al., J. Immunol., 147(1):86-95 (1991)]. Similarly, human antibodies can be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in which the
endogenous immunoglobulin genes have been partially or completely inactivated. Upon
challenge, human antibody production is observed, which closely resembles that seen in
humans in all respects, including gene rearrangement, assembly, and antibody repertoire.
This approach is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126; 5,633,425; 5,661,016, and in the following scientific publications:
Marks et al., Bio/Technology 10, 779-783 (1992); Lonberg et al., Nature 368, 856-859 (1994);
Morrison, Nature 368, 812-13 (1994); Fishwild et al., Nature Biotechnology 14, 845-51
(1996); Neuberger, Nature Biotechnology 14, 826 (1996); Lonberg and Huszar, Intern. Rev.
Immunol. 13, 65-93 (1995).

By immunotherapy is meant treatment of angiogenesis with an antibody raised
against angiogenesis proteins. As used herein, immunotherapy can be passive or active.
Passive immunotherapy as defined herein is the passive transfer of antibody to a recipient
(patient). Active immunization is the induction of antibody and/or T-cell responses in a
recipient (patient). Induction of an immune response is the result of providing the recipient
with an antigen to which antibodies are raised. As appreciated by one of ordinary skill in the
art, the antigen may be provided by injecting a polypeptide against which antibodies are
desired to be raised into a recipient, or contacting the recipient with a nucleic acid capable of
expressing the antigen and under conditions for expression of the antigen, leading to an
immune response.

In a preferred embodiment the angiogenesis proteins against which antibodies
are raised are secreted proteins as described above. Without being bound by theory,
antibodies used for treatment bind and prevent the secreted protein from binding to its
receptor, thereby inactivating the secreted angiogenesis protein.

In another preferred embodiment, the angiogenesis protein to which antibodies
are raised is a transmembrane protein. Without being bound by theory, antibodies used for
treatment bind the extracellular domain of the angiogenesis protein and prevent it from
binding to other proteins, such as circulating ligands or cell-associated molecules. The
antibody may cause down-regulation of the transmembrane angiogenesis protein. As will be
appreciated by one of ordinary skill in the art, the antibody may be a competitive,
non-competitive or uncompetitive inhibitor of protein binding to the extracellular domain of the
angiogenesis protein. The antibody is also an antagonist of the angiogenesis protein.
Further, the antibody prevents activation of the transmembrane angiogenesis protein. In one
aspect, when the antibody prevents the binding of other molecules to the angiogenesis
protein, the antibody prevents growth of the cell. The antibody may also be used to target or
sensitize the cell to cytotoxic agents, including, but not limited to, TNF-α, TNF-β, IL-1, INF-γ
and IL-2, or chemotherapeutic agents including 5FU, vinblastine, actinomycin D, cisplatin,
methotrexate, and the like. In some instances the antibody belongs to a sub-type that
activates serum complement when complexed with the transmembrane protein, thereby
mediating cytotoxicity or antibody-dependent cellular cytotoxicity (ADCC). Thus, angiogenesis is
treated by administering to a patient antibodies directed against the transmembrane
angiogenesis protein. Antibody-labeling may activate a co-toxin, localize a toxin payload, or
otherwise provide means to locally ablate cells.

In another preferred embodiment, the antibody is conjugated or fused to an
effector moiety. The effector moiety can be any number of molecules, including labelling
moieties such as radioactive labels or fluorescent labels, or can be a therapeutic moiety. In
one aspect the therapeutic moiety is a small molecule that modulates the activity of the
angiogenesis protein. In another aspect the therapeutic moiety modulates the activity of
molecules associated with or in close proximity to the angiogenesis protein. The therapeutic
moiety may inhibit enzymatic activity such as protease or collagenase activity associated with
angiogenesis, or be an attractant of other cells, such as NK cells.

In a preferred embodiment, the therapeutic moiety can also be a cytotoxic
agent. In this method, targeting the cytotoxic agent to angiogenesis tissue or cells results in a
reduction in the number of afflicted cells, thereby reducing symptoms associated with
angiogenesis. Cytotoxic agents are numerous and varied and include, but are not limited to,
cytotoxic drugs or toxins or active fragments of such toxins. Suitable toxins and their
corresponding fragments include diphtheria A chain, exotoxin A chain, ricin A chain, abrin A
chain, curcin, crotin, phenomycin, enomycin and the like. Cytotoxic agents also include
radiochemicals made by conjugating radioisotopes to antibodies raised against angiogenesis
proteins, or binding of a radionuclide to a chelating agent that has been covalently attached to
the antibody. Targeting the therapeutic moiety to transmembrane angiogenesis proteins not
only serves to increase the local concentration of therapeutic moiety in the angiogenesis
afflicted area, but also serves to reduce deleterious side effects that may be associated with
the therapeutic moiety.

In another preferred embodiment, the angiogenesis protein against which the
antibodies are raised is an intracellular protein. In this case, the antibody may be conjugated
or fused to a protein which facilitates entry into the cell. In one case, the antibody enters the
cell by endocytosis. In another embodiment, a nucleic acid encoding the antibody is
administered to the individual or cell. Moreover, where the angiogenesis protein can be
targeted within a cell, e.g., in the nucleus, an antibody thereto contains a signal for that target
localization, e.g., a nuclear localization signal.

The angiogenesis antibodies of the invention specifically bind to angiogenesis
proteins. By "specifically bind" herein is meant that the antibodies bind to the protein with a
Kd of at least about 0.1 mM, more usually at least about 1 μM, preferably at least about 0.1
μM or better, and most preferably, 0.01 μM or better. Selectivity of binding is also
important.

In a preferred embodiment, the angiogenesis protein is purified or isolated
after expression. Angiogenesis proteins may be isolated or purified in a variety of ways
known to those skilled in the art depending on what other components are present in the
sample. Standard purification methods include electrophoretic, molecular, immunological
and chromatographic techniques, including ion exchange, hydrophobic, affinity, and
reverse-phase HPLC chromatography, and chromatofocusing. For example, the angiogenesis protein
may be purified using a standard anti-angiogenesis protein antibody column. Ultrafiltration
and diafiltration techniques, in conjunction with protein concentration, are also useful. For
general guidance in suitable purification techniques, see Scopes, R., Protein Purification,
Springer-Verlag, NY (1982). The degree of purification necessary will vary depending on
the use of the angiogenesis protein. In some instances no purification will be necessary.

Once expressed and purified if necessary, the angiogenesis proteins and
nucleic acids are useful in a number of applications. They may be used as immunoselection
reagents, as vaccine reagents, as screening agents, etc.

Detection of angiogenesis sequence for diagnostic and therapeutic applications 

In one aspect, the RNA expression levels of genes are determined for different
cellular states in the angiogenesis phenotype. Expression levels of genes in normal tissue
(i.e., not undergoing angiogenesis) and in angiogenesis tissue (and in some cases, for varying
severities of angiogenesis that relate to prognosis, as outlined below) are evaluated to provide
expression profiles. An expression profile of a particular cell state or point of development is
essentially a "fingerprint" of the state. While two states may have any particular gene
similarly expressed, the evaluation of a number of genes simultaneously allows the
generation of a gene expression profile that is reflective of the state of the cell. By comparing
expression profiles of cells in different states, information regarding which genes are
important (including both up- and down-regulation of genes) in each of these states is
obtained. Then, diagnosis may be performed or confirmed to determine whether a tissue
sample has the gene expression profile of normal or angiogenic tissue. This will provide
for molecular diagnosis of related conditions.
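
The comparison of a sample "fingerprint" against reference profiles can be illustrated with a minimal sketch. It assumes, purely for illustration, that each profile is a vector of log2 expression values over the same ordered set of genes and that similarity is scored by Pearson correlation; the gene values, the labels, and the helper names `pearson` and `classify_profile` are hypothetical and are not taken from the specification.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def classify_profile(sample, references):
    """Return the label of the reference profile that correlates best with the sample."""
    return max(references, key=lambda label: pearson(sample, references[label]))

# Hypothetical log2 expression values for five genes (illustrative numbers only).
references = {
    "normal": [1.0, 2.0, 8.0, 3.0, 7.5],
    "angiogenic": [6.0, 7.5, 2.0, 8.0, 1.5],
}
sample = [5.5, 7.0, 2.5, 7.0, 2.0]
print(classify_profile(sample, references))
```

Correlation is only one possible similarity measure; any metric that compares many genes at once would serve the same illustrative purpose.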

"Differential expression," or grammatical equivalents as used herein, refers to
qualitative or quantitative differences in the temporal and/or cellular gene expression
patterns within and among cells and tissue. Thus, a differentially expressed gene can
qualitatively have its expression altered, including an activation or inactivation, in, e.g.,
normal versus angiogenic tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or more states. A qualitatively
regulated gene will exhibit an expression pattern within a state or cell type which is
detectable by standard techniques. Some genes will be expressed in one state or cell type, but
not in both. Alternatively, the difference in expression may be quantitative, e.g., in that
expression is increased or decreased; i.e., gene expression is either upregulated, resulting in
an increased amount of transcript, or downregulated, resulting in a decreased amount of
transcript. The degree to which expression differs need only be large enough to quantify via
standard characterization techniques as outlined below, such as by use of Affymetrix
GeneChip™ expression arrays, Lockhart, Nature Biotechnology, 14:1675-1680 (1996),
hereby expressly incorporated by reference. Other techniques include, but are not limited to,
quantitative reverse transcriptase PCR, Northern analysis and RNase protection. As outlined
above, preferably the change in expression (i.e., upregulation or downregulation) is at least
about 50%, more preferably at least about 100%, more preferably at least about 150%, more
preferably at least about 200%, with from 300 to at least 1000% being especially preferred.
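
As a hedged illustration of the percentage thresholds listed above, the sketch below computes a signed percent change between a reference and a test expression level and reports the highest threshold it meets. The symmetric treatment of downregulation (measuring a decrease against the lower level, so that a halving counts as -100% just as a doubling counts as +100%) is an assumption made here for illustration; the specification does not state how the percentage is computed. The function names are hypothetical.

```python
def symmetric_percent_change(reference, test):
    """Signed percent change of test versus reference.

    Assumption: a decrease is measured against the lower (test) level,
    so a 2-fold drop is -100%, mirroring a 2-fold rise of +100%.
    """
    if reference <= 0 or test <= 0:
        raise ValueError("expression levels must be positive")
    if test >= reference:
        return (test - reference) / reference * 100.0
    return -(reference - test) / test * 100.0

def preference_tier(change_pct):
    """Report the highest threshold from the text met by the magnitude of change."""
    magnitude = abs(change_pct)
    for threshold in (300, 200, 150, 100, 50):
        if magnitude >= threshold:
            return f"at least about {threshold}%"
    return "below 50%"

# Hypothetical signal intensities (arbitrary units).
change = symmetric_percent_change(reference=100.0, test=250.0)
print(round(change, 1), preference_tier(change))
```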

Evaluation may be at the gene transcript or the protein level. The amount of
gene expression may be monitored using nucleic acid probes to the DNA or RNA equivalent
of the gene transcript, and the quantification of gene expression levels, or, alternatively, the
final gene product itself (protein) can be monitored, e.g., with antibodies to the angiogenesis
protein and standard immunoassays (ELISAs, etc.) or other techniques, including mass
spectroscopy assays, 2D gel electrophoresis assays, etc. Proteins corresponding to
angiogenesis genes, i.e., those identified as being important in an angiogenesis phenotype,
can be evaluated in an angiogenesis diagnostic test.

In a preferred embodiment, gene expression monitoring is performed
simultaneously on a number of genes. Multiple protein expression monitoring can be
performed as well. Similarly, these assays may be performed on an individual basis as well.

In this embodiment, the angiogenesis nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of angiogenesis sequences in a
particular cell. The assays are further described below in the example. PCR techniques can
be used to provide greater sensitivity.

In a preferred embodiment nucleic acids encoding the angiogenesis protein are
detected. Although DNA or RNA encoding the angiogenesis protein may be detected, of
particular interest are methods wherein an mRNA encoding an angiogenesis protein is
detected. Probes to detect mRNA can be a nucleotide/deoxynucleotide probe that is
complementary to and hybridizes with the mRNA and includes, but is not limited to,
oligonucleotides, cDNA or RNA. Probes also should contain a detectable label, as defined
herein. In one method the mRNA is detected after immobilizing the nucleic acid to be
examined on a solid support such as nylon membranes and hybridizing the probe with the
sample. Following washing to remove the non-specifically bound probe, the label is
detected. In another method detection of the mRNA is performed in situ. In this method
permeabilized cells or tissue samples are contacted with a detectably labeled nucleic acid
probe for sufficient time to allow the probe to hybridize with the target mRNA. Following
washing to remove the non-specifically bound probe, the label is detected. For example, a
digoxygenin labeled riboprobe (RNA probe) that is complementary to the mRNA encoding
an angiogenesis protein is detected by binding the digoxygenin with an anti-digoxygenin
secondary antibody and developed with nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl
phosphate.

In a preferred embodiment, various proteins from the three classes of proteins 
as described herein (secreted, transmembrane or intracellular proteins) are used in diagnostic 
assays. The angiogenesis proteins, antibodies, nucleic acids, modified proteins and cells 
containing angiogenesis sequences are used in diagnostic assays. This can be performed on 
an individual gene or corresponding polypeptide level. In a preferred embodiment, the 
expression profiles are used, preferably in conjunction with high throughput screening 
techniques to allow monitoring for expression profile genes and/or corresponding 
polypeptides. 

As described and defined herein, angiogenesis proteins, including 
intracellular, transmembrane or secreted proteins, find use as markers of angiogenesis. 
Detection of these proteins in putative angiogenesis tissue allows for detection or diagnosis of 
angiogenesis. In one embodiment, antibodies are used to detect angiogenesis proteins. A 
preferred method separates proteins from a sample by electrophoresis on a gel (typically a 
denaturing and reducing protein gel, but may be another type of gel, including isoelectric 
focusing gels and the like). Following separation of proteins, the angiogenesis protein is 
detected, e.g., by immunoblotting with antibodies raised against the angiogenesis protein.
Methods of immunoblotting are well known to those of ordinary skill in the art. 

In another preferred method, antibodies to the angiogenesis protein find use in
in situ imaging techniques, e.g., in histology (e.g., Methods in Cell Biology: Antibodies in
Cell Biology, volume 37 (Asai, ed. 1993)). In this method cells are contacted with from one
to many antibodies to the angiogenesis protein(s). Following washing to remove non-specific
antibody binding, the presence of the antibody or antibodies is detected. In one embodiment
the antibody is detected by incubating with a secondary antibody that contains a detectable
label. In another method the primary antibody to the angiogenesis protein(s) contains a
detectable label, for example an enzyme marker that can act on a substrate. In another
preferred embodiment each one of multiple primary antibodies contains a distinct and
detectable label. This method finds particular use in simultaneous screening for a plurality of
angiogenesis proteins. As will be appreciated by one of ordinary skill in the art, many other
histological imaging techniques are also provided by the invention.

In a preferred embodiment the label is detected in a fluorometer which has the
ability to detect and distinguish emissions of different wavelengths. In addition, a
fluorescence activated cell sorter (FACS) can be used in the method.

In another preferred embodiment, antibodies find use in diagnosing
angiogenesis from biological samples, such as blood, urine, sputum, or other bodily fluids.
As previously described, certain angiogenesis proteins are secreted/circulating molecules.
Blood samples, therefore, are useful as samples to be probed or tested for the presence of
secreted angiogenesis proteins. Antibodies can be used to detect an angiogenesis protein by
previously described immunoassay techniques including ELISA, immunoblotting (Western
blotting), immunoprecipitation, BIACORE technology and the like. Conversely, the presence
of antibodies may indicate an immune response against an endogenous angiogenesis protein.

In a preferred embodiment, in situ hybridization of labeled angiogenesis
nucleic acid probes to tissue arrays is done. For example, arrays of tissue samples, including
angiogenesis tissue and/or normal tissue, are made. In situ hybridization (see, e.g., Ausubel,
supra) is then performed. When comparing the fingerprints between an individual and a
standard, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the
findings. It is further understood that the genes which indicate the diagnosis may differ from
those which indicate the prognosis, and molecular profiling of the condition of the cells may
lead to distinctions between responsive or refractory conditions or may be predictive of
outcomes.

In a preferred embodiment, the angiogenesis proteins, antibodies, nucleic
acids, modified proteins and cells containing angiogenesis sequences are used in prognosis
assays. As above, gene expression profiles can be generated that correlate to angiogenesis
severity, in terms of long term prognosis. Again, this may be done on either a protein or gene
level, with the use of genes being preferred. As above, angiogenesis probes may be attached
to biochips for the detection and quantification of angiogenesis sequences in a tissue or
patient. The assays proceed as outlined above for diagnosis. PCR methods may provide more
sensitive and accurate quantification.

In a preferred embodiment members of the three classes of proteins as
described herein are used in drug screening assays. The angiogenesis proteins, antibodies,
nucleic acids, modified proteins and cells containing angiogenesis sequences are used in drug
screening assays or by evaluating the effect of drug candidates on a "gene expression profile"
or expression profile of polypeptides. In a preferred embodiment, the expression profiles are
used, preferably in conjunction with high throughput screening techniques to allow
monitoring for expression profile genes after treatment with a candidate agent (e.g.,
Zlokarnik, et al., Science 279, 84-8 (1998); Heid, Genome Res 6:986-94, 1996).

In a preferred embodiment, the angiogenesis proteins, antibodies, nucleic
acids, modified proteins and cells containing the native or modified angiogenesis proteins are
used in screening assays. That is, the present invention provides novel methods for screening
for compositions which modulate the angiogenesis phenotype or an identified physiological
function of an angiogenesis protein. As above, this can be done on an individual gene level
or by evaluating the effect of drug candidates on a "gene expression profile". In a preferred
embodiment, the expression profiles are used, preferably in conjunction with high throughput
screening techniques to allow monitoring for expression profile genes after treatment with a
candidate agent, see Zlokarnik, supra.

Having identified the differentially expressed genes herein, a variety of assays
may be executed. In a preferred embodiment, assays may be run on an individual gene or
protein level. That is, having identified a particular gene as up regulated in angiogenesis, test
compounds can be screened for the ability to modulate gene expression or for binding to the
angiogenic protein. "Modulation" thus includes both an increase and a decrease in gene
expression. The preferred amount of modulation will depend on the original change of the
gene expression in normal versus tissue undergoing angiogenesis, with changes of at least
10%, preferably 50%, more preferably 100-300%, and in some embodiments 300-1000% or
greater. Thus, if a gene exhibits a 4-fold increase in angiogenic tissue compared to normal
tissue, a decrease of about four-fold is often desired; similarly, a 10-fold decrease in
angiogenic tissue compared to normal tissue often provides a target value of a 10-fold
increase in expression to be induced by the test compound.
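
The arithmetic in the preceding paragraph (a 4-fold increase calls for a roughly 4-fold decrease, a 10-fold decrease for a 10-fold increase) can be written as a short sketch. The representation of the observed change as a single ratio of angiogenic to normal expression, and the function name `target_modulation`, are assumptions made for illustration only.

```python
def target_modulation(ratio_angiogenic_to_normal):
    """Return (direction, fold) a test compound would need to achieve
    to bring expression in angiogenic tissue back to the normal level.

    ratio > 1 means the gene is upregulated in angiogenic tissue,
    ratio < 1 means it is downregulated.
    """
    ratio = ratio_angiogenic_to_normal
    if ratio <= 0:
        raise ValueError("expression ratio must be positive")
    if ratio > 1:
        return ("decrease", ratio)
    if ratio < 1:
        return ("increase", 1.0 / ratio)
    return ("none", 1.0)

# A 4-fold increase in angiogenic tissue suggests a 4-fold decrease as the target.
print(target_modulation(4.0))
# A 10-fold decrease (ratio 0.1) suggests a 10-fold increase as the target.
print(target_modulation(0.1))
```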

The amount of gene expression may be monitored using nucleic acid probes
and the quantification of gene expression levels, or, alternatively, the gene product itself can
be monitored, e.g., through the use of antibodies to the angiogenesis protein and standard
immunoassays. Proteomics and separation techniques may also allow quantification of
expression.

In a preferred embodiment, gene expression or protein monitoring of a number
of entities, i.e., an expression profile, is monitored simultaneously. Such profiles will
typically involve a plurality of those entities described herein.

In this embodiment, the angiogenesis nucleic acid probes are attached to
biochips as outlined herein for the detection and quantification of angiogenesis sequences in a
particular cell. Alternatively, PCR may be used. Thus, a series, e.g., of microtiter plates, may
be used with dispensed primers in desired wells. A PCR reaction can then be performed and
analyzed for each well.
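
One way to picture the dispensing of primers into desired wells is a simple plate map. The sketch below assumes a standard 8 x 12 (96-well) microtiter layout filled row by row; the layout, the primer-set names, and the function `plate_map` are hypothetical and only illustrate the bookkeeping, not a prescribed protocol.

```python
from string import ascii_uppercase

def plate_map(primer_sets, rows=8, cols=12):
    """Assign each primer set to a well (A1, A2, ... H12), filling row by row."""
    wells = [f"{ascii_uppercase[r]}{c}" for r in range(rows) for c in range(1, cols + 1)]
    if len(primer_sets) > len(wells):
        raise ValueError("more primer sets than wells on the plate")
    return dict(zip(wells, primer_sets))

# Hypothetical primer-set identifiers, one per angiogenesis sequence of interest.
layout = plate_map(["gene_A", "gene_B", "gene_C"])
for well, primers in layout.items():
    print(well, primers)
```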

Modulators of angiogenesis 

Expression monitoring can be performed to identify compounds that modify
the expression of one or more angiogenesis-associated sequences, e.g., a polynucleotide
sequence set out in Tables 1-8. Generally, in a preferred embodiment, a test modulator is
added to the cells prior to analysis. Moreover, screens are also provided to identify agents
that modulate angiogenesis, modulate angiogenesis proteins, bind to an angiogenesis protein,
or interfere with the binding of an angiogenesis protein and an antibody or other binding
partner.

The term "test compound" or "drug candidate" or "modulator" or grammatical
equivalents as used herein describes any molecule, e.g., protein, oligopeptide, small organic
molecule, polysaccharide, polynucleotide, etc., to be tested for the capacity to directly or
indirectly alter the angiogenesis phenotype or the expression of an angiogenesis sequence,
e.g., a nucleic acid or protein sequence. In preferred embodiments, modulators alter
expression profiles, or expression profile nucleic acids or proteins provided herein. In one
embodiment, the modulator suppresses an angiogenesis phenotype, for example to a normal
tissue fingerprint. In another embodiment, a modulator induces an angiogenesis phenotype.
Generally, a plurality of assay mixtures are run in parallel with different agent concentrations
to obtain a differential response to the various concentrations. Typically, one of these
concentrations serves as a negative control, i.e., at zero concentration or below the level of
detection.
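
The parallel assay design described above can be sketched as a concentration series that ends in a zero-concentration negative control, with each readout expressed relative to that control. The serial-dilution scheme, the example readouts, and the function names are assumptions chosen for illustration; the specification does not prescribe a particular dilution factor or normalization.

```python
def concentration_series(top, dilution_factor, n_points):
    """Serial dilution starting at `top`, followed by a zero-concentration control."""
    if dilution_factor <= 1:
        raise ValueError("dilution factor must be greater than 1")
    series = [top / dilution_factor ** i for i in range(n_points)]
    return series + [0.0]

def percent_of_control(signals):
    """Express each signal as a percentage of the last entry (the negative control)."""
    control = signals[-1]
    return [s * 100.0 / control for s in signals]

# Hypothetical 10-fold dilutions from 10 (arbitrary concentration units).
concentrations = concentration_series(top=10.0, dilution_factor=10, n_points=4)
# Hypothetical assay readouts, one per concentration, control last.
signals = [20.0, 50.0, 90.0, 98.0, 100.0]
for conc, pct in zip(concentrations, percent_of_control(signals)):
    print(conc, round(pct, 1))
```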

In one aspect, a modulator will neutralize the effect of an angiogenesis protein. 
By "neutralize" is meant that activity of a protein is inhibited or blocked and thereby has 
substantially no effect on a cell. 
In certain embodiments, combinatorial libraries of potential modulators will be
screened for an ability to bind to an angiogenesis polypeptide or to modulate activity.
Conventionally, new chemical entities with useful properties are generated by identifying a
chemical compound (called a "lead compound") with some desirable property or activity,
e.g., inhibiting activity, creating variants of the lead compound, and evaluating the property
and activity of those variant compounds. Often, high throughput screening (HTS) methods
are employed for such an analysis.

In one preferred embodiment, high throughput screening methods involve
providing a library containing a large number of potential therapeutic compounds (candidate
compounds). Such "combinatorial chemical libraries" are then screened in one or more
assays to identify those library members (particular chemical species or subclasses) that
display a desired characteristic activity. The compounds thus identified can serve as
conventional "lead compounds" or can themselves be used as potential or actual therapeutics.
A combinatorial chemical library is a collection of diverse chemical
compounds generated by either chemical synthesis or biological synthesis by combining a
number of chemical "building blocks" such as reagents. For example, a linear combinatorial
chemical library, such as a polypeptide (e.g., mutein) library, is formed by combining a set of
chemical building blocks called amino acids in every possible way for a given compound
length (i.e., the number of amino acids in a polypeptide compound). Millions of chemical
compounds can be synthesized through such combinatorial mixing of chemical building
blocks (Gallop et al. (1994) J. Med. Chem. 37(9): 1233-1251).
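
The scale of a linear combinatorial library follows from simple counting: with b building blocks and a compound length of n, there are b^n possible sequences. The sketch below shows this count and a toy enumeration; the choice of the 20 standard amino acids and the function names are illustrative assumptions rather than details from the specification.

```python
from itertools import product

def linear_library_size(n_building_blocks, length):
    """Number of distinct linear compounds of the given length."""
    return n_building_blocks ** length

def enumerate_library(building_blocks, length):
    """List every linear combination of the building blocks (small cases only)."""
    return ["".join(combo) for combo in product(building_blocks, repeat=length)]

# With the 20 standard amino acids, pentapeptides already number in the millions.
print(linear_library_size(20, 5))   # 3200000
# Toy enumeration with two building blocks and length 2.
print(enumerate_library("AG", 2))   # ['AA', 'AG', 'GA', 'GG']
```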

Preparation and screening of combinatorial chemical libraries is well known to
those of skill in the art. Such combinatorial chemical libraries include, but are not limited to,
peptide libraries (see, e.g., U.S. Patent No. 5,010,175, Furka (1991) Int. J. Pept. Prot. Res.,
37: 487-493, Houghton et al. (1991) Nature, 354: 84-88), peptoids (PCT Publication No. WO
91/19735, 26 Dec. 1991), encoded peptides (PCT Publication WO 93/20242, 14 Oct. 1993),
random bio-oligomers (PCT Publication WO 92/00091, 9 Jan. 1992), benzodiazepines (U.S.
Pat. No. 5,288,514), diversomers such as hydantoins, benzodiazepines and dipeptides (Hobbs
et al. (1993) Proc. Nat. Acad. Sci. USA 90: 6909-6913), vinylogous polypeptides (Hagihara
et al. (1992) J. Amer. Chem. Soc. 114: 6568), nonpeptidal peptidomimetics with a
Beta-D-Glucose scaffolding (Hirschmann et al. (1992) J. Amer. Chem. Soc. 114: 9217-9218),
analogous organic syntheses of small compound libraries (Chen et al. (1994) J. Amer. Chem.
Soc. 116: 2661), oligocarbamates (Cho et al. (1993) Science 261: 1303), and/or peptidyl
phosphonates (Campbell et al. (1994) J. Org. Chem. 59: 658). See, generally, Gordon et al.
(1994) J. Med. Chem. 37: 1385, nucleic acid libraries (see, e.g., Stratagene, Corp.), peptide
nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), antibody libraries (see, e.g., Vaughn
et al. (1996) Nature Biotechnology, 14(3): 309-314, and PCT/US96/10287), carbohydrate
libraries (see, e.g., Liang et al. (1996) Science, 274: 1520-1522, and U.S. Patent No.
5,593,853), and small organic molecule libraries (see, e.g., benzodiazepines, Baum (1993)
C&EN, Jan 18, page 33; isoprenoids, U.S. Patent No. 5,569,588; thiazolidinones and
metathiazanones, U.S. Patent No. 5,549,974; pyrrolidines, U.S. Patent Nos. 5,525,735 and
5,519,134; morpholino compounds, U.S. Patent No. 5,506,337; benzodiazepines, U.S. Patent
No. 5,288,514; and the like).
Devices for the preparation of combinatorial libraries are commercially
available (see, e.g., 357 MPS, 390 MPS, Advanced Chem Tech, Louisville KY, Symphony,
Rainin, Woburn, MA, 433A Applied Biosystems, Foster City, CA, 9050 Plus, Millipore,
Bedford, MA).

A number of well known robotic systems have also been developed for
solution phase chemistries. These systems include automated workstations like the
automated synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton, Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.), which mimic the manual
synthetic operations performed by a chemist. Any of the above devices are suitable for use
with the present invention. The nature and implementation of modifications to these devices
(if any) so that they can operate as discussed herein will be apparent to persons skilled in the
relevant art. In addition, numerous combinatorial libraries are themselves commercially
available (see, e.g., ComGenex, Princeton, N.J., Asinex, Moscow, RU, Tripos, Inc., St. Louis,
MO, ChemStar, Ltd, Moscow, RU, 3D Pharmaceuticals, Exton, PA, Martek Biosciences,
Columbia, MD, etc.).

The assays to identify modulators are amenable to high throughput screening. 
Preferred assays thus detect enhancement or inhibition of angiogenesis gene transcription, 
inhibition or enhancement of polypeptide expression, and inhibition or enhancement of 
polypeptide activity. 

High throughput assays for the presence, absence, quantification, or other
properties of particular nucleic acids or protein products are well known to those of skill in
the art. Binding assays and reporter gene assays are similarly well known. Thus,
for example, U.S. Patent No. 5,559,410 discloses high throughput screening methods for
proteins, U.S. Patent No. 5,585,639 discloses high throughput screening methods for nucleic
acid binding (i.e., in arrays), while U.S. Patent Nos. 5,576,220 and 5,541,061 disclose high
throughput methods of screening for ligand/antibody binding.

In addition, high throughput screening systems are commercially available
(see, e.g., Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.). These systems
typically automate entire procedures, including all sample and reagent pipetting, liquid
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate
for the assay. These configurable systems provide high throughput and rapid start up as well
as a high degree of flexibility and customization. The manufacturers of such systems provide
detailed protocols for various high throughput systems. Thus, for example, Zymark Corp.
provides technical bulletins describing screening systems for detecting the modulation of
gene transcription, ligand binding, and the like.

In one embodiment, modulators are proteins, often naturally occurring
proteins or fragments of naturally occurring proteins. Thus, e.g., cellular extracts containing
proteins, or random or directed digests of proteinaceous cellular extracts, may be used. In
this way libraries of proteins may be made for screening in the methods of the invention.
Particularly preferred in this embodiment are libraries of bacterial, fungal, viral, and
mammalian proteins, with the latter being preferred, and human proteins being especially
preferred. Particularly useful test compounds will be directed to the class of proteins to which
the target belongs, e.g., substrates for enzymes, or ligands and receptors.

In a preferred embodiment, modulators are peptides of from about 5 to about
30 amino acids, with from about 5 to about 20 amino acids being preferred, and from about 7
to about 15 being particularly preferred. The peptides may be digests of naturally occurring
proteins as is outlined above, random peptides, or "biased" random peptides. By
"randomized" or grammatical equivalents herein is meant that each nucleic acid and peptide
consists of essentially random nucleotides and amino acids, respectively. Since generally
these random peptides (or nucleic acids, discussed below) are chemically synthesized, they
may incorporate any nucleotide or amino acid at any position. The synthetic process can be
designed to generate randomized proteins or nucleic acids, to allow the formation of all or
most of the possible combinations over the length of the sequence, thus forming a library of
randomized candidate bioactive proteinaceous agents.

In one embodiment, the library is fully randomized, with no sequence
preferences or constants at any position. In a preferred embodiment, the library is biased.
That is, some positions within the sequence are either held constant, or are selected from a
limited number of possibilities. For example, in a preferred embodiment, the nucleotides or
amino acid residues are randomized within a defined class, for example, of hydrophobic
amino acids, hydrophilic residues, sterically biased (either small or large) residues, towards
the creation of nucleic acid binding domains, the creation of cysteines, for cross-linking,
prolines for SH-3 domains, serines, threonines, tyrosines or histidines for phosphorylation
sites, etc., or to purines, etc.
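
The distinction between a fully randomized and a biased library can be illustrated with a short sketch. The template symbols, the residue grouping, and the function names below are assumptions made only for this illustration and are not part of the disclosure; any chemically synthesized library would be built by the synthetic process itself rather than in software.

```python
import random

# Hypothetical residue classes; the grouping is an illustrative assumption.
HYDROPHOBIC = "AVLIMFW"
ALL_RESIDUES = "ACDEFGHIKLMNPQRSTVWY"


def biased_peptide(template, rng):
    """Build one peptide from a template.

    Template symbols (assumed convention for this sketch):
      'X' -> any of the 20 amino acids (fully randomized position)
      'h' -> a residue drawn from the hydrophobic class
      any other character -> held constant at that position
    """
    out = []
    for symbol in template:
        if symbol == "X":
            out.append(rng.choice(ALL_RESIDUES))
        elif symbol == "h":
            out.append(rng.choice(HYDROPHOBIC))
        else:
            out.append(symbol)
    return "".join(out)


def make_library(template, size, seed=0):
    """Return `size` peptides drawn from the biased template."""
    rng = random.Random(seed)
    return [biased_peptide(template, rng) for _ in range(size)]


# Nine-residue peptides with a constant cysteine and two constant prolines.
library = make_library("XXCPhXPXX", size=5)
```

A template consisting only of 'X' symbols corresponds to the fully randomized case; every constant or class-restricted position narrows the library toward the intended bias.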

Modulators of angiogenesis can also be nucleic acids, as defined above. 
As described above generally for proteins, nucleic acid modulating agents may
be naturally occurring nucleic acids, random nucleic acids, or "biased" random nucleic acids.
For example, digests of procaryotic or eucaryotic genomes may be used as is outlined above
for proteins.

In a preferred embodiment, the candidate compounds are organic chemical 
moieties, a wide variety of which are available in the literature. 

After the candidate agent has been added and the cells allowed to incubate for
some period of time, the sample containing a target sequence to be analyzed is added to the
biochip. If required, the target sequence is prepared using known techniques. For example,
the sample may be treated to lyse the cells, using known lysis buffers, electroporation, etc.,
with purification and/or amplification such as PCR performed as appropriate. For example,
an in vitro transcription with labels covalently attached to the nucleotides is performed.
Generally, the nucleic acids are labeled with biotin-FITC or PE, or with Cy3 or Cy5.

In a preferred embodiment, the target sequence is labeled with, for example, a
fluorescent, a chemiluminescent, a chemical, or a radioactive signal, to provide a means of
detecting the target sequence's specific binding to a probe. The label also can be an enzyme,
such as alkaline phosphatase or horseradish peroxidase, which when provided with an
appropriate substrate produces a product that can be detected. Alternatively, the label can be
a labeled compound or small molecule, such as an enzyme inhibitor, that binds but is not
catalyzed or altered by the enzyme. The label also can be a moiety or compound, such as an
epitope tag or biotin which specifically binds to streptavidin. For the example of biotin, the
streptavidin is labeled as described above, thereby providing a detectable signal for the
bound target sequence. Unbound labeled streptavidin is typically removed prior to analysis.

As will be appreciated by those in the art, these assays can be direct
hybridization assays or can comprise "sandwich assays", which include the use of multiple
probes, as is generally outlined in U.S. Patent Nos. 5,681,702, 5,597,909, 5,545,730,
5,594,117, 5,591,584, 5,571,670, 5,580,731, 5,571,670, 5,591,584, 5,624,802, 5,635,352,
5,594,118, 5,359,100, 5,124,246 and 5,681,697, all of which are hereby incorporated by
reference. In this embodiment, in general, the target nucleic acid is prepared as outlined
above, and then added to the biochip comprising a plurality of nucleic acid probes, under
conditions that allow the formation of a hybridization complex.

A variety of hybridization conditions may be used in the present invention,
including high, moderate and low stringency conditions as outlined above. The assays are
generally run under stringency conditions which allow formation of the label probe
hybridization complex only in the presence of target. Stringency can be controlled by
altering a step parameter that is a thermodynamic variable, including, but not limited to,
temperature, formamide concentration, salt concentration, chaotropic salt concentration, pH,
organic solvent concentration, etc.
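
How temperature, salt, and formamide trade off against one another can be sketched with a commonly cited empirical approximation for the melting temperature of long DNA duplexes. This formula is not taken from the present disclosure; its coefficients vary somewhat between references, it does not apply to short oligonucleotide probes, and the function and parameter names are assumptions for illustration only.

```python
import math


def estimate_tm(na_molar, gc_percent, formamide_percent, length_bp):
    """Approximate duplex melting temperature in degrees Celsius.

    Empirical rule of thumb for long DNA:DNA hybrids (an assumption of
    this sketch, not a method of the disclosure):
      Tm = 81.5 + 16.6*log10[Na+] + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_bp)


# Same 100 bp, 50% GC duplex under three hypothetical conditions.
tm_aqueous = estimate_tm(1.0, 50.0, 0.0, 100)     # 1 M Na+, no formamide
tm_formamide = estimate_tm(1.0, 50.0, 50.0, 100)  # 50% formamide added
tm_low_salt = estimate_tm(0.1, 50.0, 0.0, 100)    # salt lowered to 0.1 M
```

Under this approximation, adding formamide or lowering the salt concentration reduces the estimated melting temperature, so a fixed hybridization or wash temperature becomes more stringent.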

These parameters may also be used to control non-specific binding, as is
generally outlined in U.S. Patent No. 5,681,697. Thus it may be desirable to perform certain
steps at higher stringency conditions to reduce non-specific binding.

The reactions outlined herein may be accomplished in a variety of ways.
Components of the reaction may be added simultaneously, or sequentially, in different orders,
with preferred embodiments outlined below. In addition, the reaction may include a variety
of other reagents. These include salts, buffers, neutral proteins, e.g., albumin, detergents, etc.,
which may be used to facilitate optimal hybridization and detection, and/or reduce
non-specific or background interactions. Reagents that otherwise improve the efficiency of the
assay, such as protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may also be
used as appropriate, depending on the sample preparation methods and purity of the target.

The assay data are analyzed to determine the expression levels, and changes in
expression levels as between states, of individual genes, forming a gene expression profile.
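
One simple way to represent such a profile is as a per-gene log ratio between two states. The sketch below is illustrative only: the gene names, signal values, pseudocount, and function name are hypothetical, and real array data would normally be normalized before any comparison.

```python
import math


def expression_profile(state_a, state_b, pseudocount=1.0):
    """Return {gene: log2(state_b / state_a)} for genes measured in both states.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for genes with no detectable signal.
    """
    profile = {}
    for gene in state_a.keys() & state_b.keys():
        ratio = (state_b[gene] + pseudocount) / (state_a[gene] + pseudocount)
        profile[gene] = math.log2(ratio)
    return profile


# Hypothetical signal intensities for three genes in two states.
normal = {"geneA": 99.0, "geneB": 199.0, "geneC": 49.0}
angiogenic = {"geneA": 399.0, "geneB": 199.0, "geneC": 24.0}

profile = expression_profile(normal, angiogenic)
```

Positive values indicate higher expression in the second state, negative values lower expression, and values near zero indicate no change between states.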

Screens are performed to identify modulators of the angiogenesis phenotype.
In one embodiment, screening is performed to identify modulators that can induce or
suppress a particular expression profile, thus preferably generating the associated phenotype.
In another embodiment, e.g., for diagnostic applications, having identified differentially
expressed genes important in a particular state, screens can be performed to identify
modulators that alter expression of individual genes. In another embodiment, screening is
performed to identify modulators that alter a biological function of the expression product of
a differentially expressed gene. Again, having identified the importance of a gene in a
particular state, screens are performed to identify agents that bind and/or modulate the
biological activity of the gene product.

In addition, screens can be done for genes that are induced in response to a
candidate agent. After identifying a modulator based upon its ability to suppress an
angiogenesis expression pattern leading to a normal expression pattern, or to modulate a
single angiogenesis gene expression profile so as to mimic the expression of the gene from
normal tissue, a screen as described above can be performed to identify genes that are
specifically modulated in response to the agent. Comparing expression profiles between
normal tissue and agent treated angiogenesis tissue reveals genes that are not expressed in
normal tissue or angiogenesis tissue, but are expressed in agent treated tissue. These
agent-specific sequences can be identified and used by methods described herein for angiogenesis
genes or proteins. In particular these sequences and the proteins they encode find use in
marking or identifying agent treated cells. In addition, antibodies can be raised against the
agent induced proteins and used to target novel therapeutics to the treated angiogenesis tissue
sample.

Thus, in one embodiment, a test compound is administered to a population of
angiogenic cells that have an associated angiogenesis expression profile. By
"administration" or "contacting" herein is meant that the candidate agent is added to the cells
in such a manner as to allow the agent to act upon the cell, whether by uptake and
intracellular action, or by action at the cell surface. In some embodiments, nucleic acid
encoding a proteinaceous candidate agent (i.e., a peptide) may be put into a viral construct
such as an adenoviral or retroviral construct, and added to the cell, such that expression of
the peptide agent is accomplished, e.g., PCT US97/01019. Regulatable gene therapy systems
can also be used.

Once the test compound has been administered to the cells, the cells can be
washed if desired and are allowed to incubate under preferably physiological conditions for
some period of time. The cells are then harvested and a new gene expression profile is
generated, as outlined herein.

Thus, for example, angiogenesis tissue may be screened for agents that
modulate, e.g., induce or suppress the angiogenesis phenotype. A change in at least one
gene, preferably many, of the expression profile indicates that the agent has an effect on
angiogenesis activity. By defining such a signature for the angiogenesis phenotype, screens
for new drugs that alter the phenotype can be devised. With this approach, the drug target
need not be known and need not be represented in the original expression screening platform,
nor does the level of transcript for the target protein need to change.
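
The criterion that a change in at least one signature gene indicates an effect can be sketched as follows. The two-fold cutoff, the gene names, and the expression values are assumptions chosen only for illustration; the disclosure does not prescribe a particular threshold.

```python
def changed_genes(before, after, signature, min_fold=2.0):
    """List signature genes whose level changed at least `min_fold` in either direction."""
    hits = []
    for gene in signature:
        b, a = before[gene], after[gene]
        if b <= 0 or a <= 0:
            continue  # skip genes without a usable signal in this sketch
        fold = a / b if a >= b else b / a
        if fold >= min_fold:
            hits.append(gene)
    return hits


# Hypothetical signature and expression levels before and after treatment.
signature = ["geneA", "geneB", "geneC"]
untreated = {"geneA": 400.0, "geneB": 200.0, "geneC": 25.0}
treated = {"geneA": 120.0, "geneB": 210.0, "geneC": 60.0}

hits = changed_genes(untreated, treated, signature)
agent_has_effect = len(hits) >= 1
```

Because the test looks only at the signature genes, it does not require that the agent's direct target be among them or that the target's own transcript level change.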

Measurement of angiogenesis polypeptide activity, or of angiogenesis or the
angiogenic phenotype, can be performed using a variety of assays. For example, the effects of
the test compounds upon the function of the angiogenesis polypeptides can be measured by
examining parameters described above. A suitable physiological change that affects activity
can be used to assess the influence of a test compound on the polypeptides of this invention.
When the functional consequences are determined using intact cells or animals, one can also
measure a variety of effects such as, in the case of angiogenesis associated with tumors,
tumor growth, neovascularization, hormone release, transcriptional changes to both known
and uncharacterized genetic markers (e.g., northern blots), changes in cell metabolism such as
cell growth or pH changes, and changes in intracellular second messengers such as cGMP. In
the assays of the invention, mammalian angiogenesis polypeptide is typically used, e.g.,
mouse, preferably human.

A variety of angiogenesis assays are known to those of skill in the art. Various
models have been employed to evaluate angiogenesis (e.g., Croix et al., Science 289:1197-1202,
2000 and Kahn et al., Amer. J. Pathol. 156:1887-1900). Assessment of angiogenesis
in the presence of a potential modulator of angiogenesis can be performed using
cell-culture-based angiogenesis assays, e.g., endothelial cell tube formation assays, as well as other
bioassays such as the chick CAM assay, the mouse corneal assay, and assays measuring the
effect of administering potential modulators on implanted tumors. The chick CAM assay is
described by O'Reilly et al., Cell 79:315-328, 1994. Briefly, 3 day old chicken embryos with
intact yolks are separated from the egg and placed in a petri dish. After 3 days of incubation,
a methylcellulose disc containing the protein to be tested is applied to the CAM of individual
embryos. After about 48 hours of incubation, the embryos and CAMs are observed to
determine whether endothelial growth has been inhibited. The mouse corneal assay involves
implanting a growth factor-containing pellet, along with another pellet containing the
suspected endothelial growth inhibitor, in the cornea of a mouse and observing the pattern of
capillaries that are elaborated in the cornea. Angiogenesis can also be measured by
determining the extent of neovascularization of a tumor. For example, carcinoma cells can be
subcutaneously inoculated into athymic nude mice and tumor growth then monitored. The
cancer cells are treated with an angiogenesis inhibitor, such as an antibody or other
compound that is exogenously administered, or can be transfected prior to inoculation with a
polynucleotide inhibitor of angiogenesis. Immunoassays using endothelial cell-specific
antibodies are typically used to stain for vascularization of the tumor and to determine the
number of vessels in the tumor.

Assays to identify compounds with modulating activity can be performed in
vitro. For example, an angiogenesis polypeptide is first contacted with a potential modulator
and incubated for a suitable amount of time, e.g., from 0.5 to 48 hours. In one embodiment,
the angiogenesis polypeptide levels are determined in vitro by measuring the level of protein
or mRNA. The level of protein is measured using immunoassays such as western blotting,
ELISA and the like with an antibody that selectively binds to the angiogenesis polypeptide or
a fragment thereof. For measurement of mRNA, amplification, e.g., using PCR, LCR, or
hybridization assays, e.g., northern hybridization, RNAse protection, dot blotting, are
preferred. The level of protein or mRNA is detected using directly or indirectly labeled
detection agents, e.g., fluorescently or radioactively labeled nucleic acids, radioactively or
enzymatically labeled antibodies, and the like, as described herein.

Alternatively, a reporter gene system can be devised using the angiogenesis
protein promoter operably linked to a reporter gene such as luciferase, green fluorescent
protein, CAT, or β-gal. The reporter construct is typically transfected into a cell. After
treatment with a potential modulator, the amount of reporter gene transcription, translation, or
activity is measured according to standard techniques known to those of skill in the art.

In a preferred embodiment, as outlined above, screens may be done on
individual genes and gene products (proteins). That is, having identified a particular
differentially expressed gene as important in a particular state, screening of modulators of the
expression of the gene or the gene product itself can be done. The gene products of
differentially expressed genes are sometimes referred to herein as "angiogenesis proteins". In
preferred embodiments the angiogenesis protein comprises a sequence shown in Table 8.
The angiogenesis protein may be a fragment, or alternatively, be the full length protein to a
fragment shown herein.

Preferably, the angiogenesis protein is a fragment of approximately 14 to 24
amino acids long. More preferably the fragment is a soluble fragment. In one embodiment
an angiogenesis protein is conjugated or fused to an immunogenic agent or BSA.

In one embodiment, screening for modulators of expression of specific genes
is performed. Typically, the expression of only one or a few genes is evaluated. In another
embodiment, screens are designed to first find compounds that bind to differentially
expressed proteins. These compounds are then evaluated for the ability to modulate
differentially expressed activity. Moreover, once initial candidate compounds are identified,
variants can be further screened to better evaluate structure-activity relationships.

In a preferred embodiment, binding assays are done. In general, purified or
isolated gene product is used; that is, the gene products of one or more differentially
expressed nucleic acids are made. For example, antibodies are generated to the protein gene
products, and standard immunoassays are run to determine the amount of protein present.
Alternatively, cells comprising the angiogenesis proteins can be used in the assays.

Thus, in a preferred embodiment, the methods comprise combining an
angiogenesis protein and a candidate compound, and determining the binding of the
compound to the angiogenesis protein. Preferred embodiments utilize the human
angiogenesis protein, although other mammalian proteins may also be used, for example for
the development of animal models of human disease. In some embodiments, as outlined
herein, variant or derivative angiogenesis proteins may be used.

Generally, in a preferred embodiment of the methods herein, the angiogenesis
protein or the candidate agent is non-diffusably bound to an insoluble support having isolated
sample receiving areas (e.g., a microtiter plate, an array, etc.). The insoluble supports may be
made of any composition to which the compositions can be bound, that is readily separated from
soluble material, and that is otherwise compatible with the overall method of screening. The
surface of such supports may be solid or porous and of any convenient shape. Examples of
suitable insoluble supports include microtiter plates, arrays, membranes and beads. These are
typically made of glass, plastic (e.g., polystyrene), polysaccharides, nylon or nitrocellulose,
teflon™, etc. Microtiter plates and arrays are especially convenient because a large number
of assays can be carried out simultaneously, using small amounts of reagents and samples.
The particular manner of binding of the composition is not crucial so long as it is compatible
with the reagents and overall methods of the invention, maintains the activity of the
composition and is nondiffusable. Preferred methods of binding include the use of antibodies
(which do not sterically block either the ligand binding site or activation sequence when the
protein is bound to the support), direct binding to "sticky" or ionic supports, chemical
crosslinking, the synthesis of the protein or agent on the surface, etc. Following binding of
the protein or agent, excess unbound material is removed by washing. The sample receiving
areas may then be blocked through incubation with bovine serum albumin (BSA), casein or
other innocuous protein or other moiety.

In a preferred embodiment, the angiogenesis protein is bound to the support,
and a test compound is added to the assay. Alternatively, the candidate agent is bound to the
support and the angiogenesis protein is added. Novel binding agents include specific
antibodies, non-natural binding agents identified in screens of chemical libraries, peptide
analogs, etc. Of particular interest are screening assays for agents that have a low toxicity for
human cells. A wide variety of assays may be used for this purpose, including labeled in
vitro protein-protein binding assays, electrophoretic mobility shift assays, immunoassays for
protein binding, functional assays (phosphorylation assays, etc.) and the like.

The determination of the binding of the test modulating compound to the
angiogenesis protein may be done in a number of ways. In a preferred embodiment, the
compound is labeled, and binding determined directly, e.g., by attaching all or a portion of
the angiogenesis protein to a solid support, adding a labeled candidate agent (e.g., a
fluorescent label), washing off excess reagent, and determining whether the label is present
on the solid support. Various blocking and washing steps may be utilized as appropriate.

By "labeled" herein is meant that the compound is either directly or indirectly
labeled with a label which provides a detectable signal, e.g., radioisotopes, fluorescers,
enzymes, antibodies, particles such as magnetic particles, chemiluminescers, or specific
binding molecules, etc. Specific binding molecules include pairs, such as biotin and
streptavidin, digoxin and antidigoxin, etc. For the specific binding members, the
complementary member would normally be labeled with a molecule which provides for
detection, in accordance with known procedures, as outlined above. The label can directly or
indirectly provide a detectable signal.

In some embodiments, only one of the components is labeled, e.g., the
proteins (or proteinaceous candidate compounds) can be labeled. Alternatively, more than
one component can be labeled with different labels, e.g., ¹²⁵I for the proteins and a fluorophore
for the compound. Proximity reagents, e.g., quenching or energy transfer reagents, are also
useful.

In one embodiment, the binding of the test compound is determined by
competitive binding assay. The competitor is a binding moiety known to bind to the target
molecule (i.e., an angiogenesis protein), such as an antibody, peptide, binding partner, ligand,
etc. Under certain circumstances, there may be competitive binding between the compound
and the binding moiety, with the binding moiety displacing the compound. In one
embodiment, the test compound is labeled. Either the compound, or the competitor, or both,
is added first to the protein for a time sufficient to allow binding, if present. Incubations may
be performed at a temperature which facilitates optimal activity, typically between 4 and
40°C. Incubation periods are typically optimized, e.g., to facilitate rapid high throughput
screening. Typically between 0.1 and 1 hour will be sufficient. Excess reagent is generally
removed or washed away. The second component is then added, and the presence or absence
of the labeled component is followed, to indicate binding.

In a preferred embodiment, the competitor is added first, followed by the test
compound. Displacement of the competitor is an indication that the test compound is binding
to the angiogenesis protein and thus is capable of binding to, and potentially modulating, the
activity of the angiogenesis protein. In this embodiment, either component can be labeled.
Thus, for example, if the competitor is labeled, the presence of label in the wash solution
indicates displacement by the agent. Alternatively, if the test compound is labeled, the
presence of the label on the support indicates displacement.

In an alternative embodiment, the test compound is added first, with
incubation and washing, followed by the competitor. The absence of binding by the
competitor may indicate that the test compound is bound to the angiogenesis protein with a
higher affinity. Thus, if the test compound is labeled, the presence of the label on the
support, coupled with a lack of competitor binding, may indicate that the test compound is
capable of binding to the angiogenesis protein.

In a preferred embodiment, the methods comprise differential screening to
identify agents that are capable of modulating the activity of the angiogenesis proteins. In
this embodiment, the methods comprise combining an angiogenesis protein and a competitor
in a first sample. A second sample comprises a test compound, an angiogenesis protein, and
a competitor. The binding of the competitor is determined for both samples, and a change, or
difference in binding between the two samples, indicates the presence of an agent capable of
binding to the angiogenesis protein and potentially modulating its activity. That is, if the
binding of the competitor is different in the second sample relative to the first sample, the
agent is capable of binding to the angiogenesis protein.
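
The two-sample comparison can be sketched as a simple decision rule. The relative-change cutoff of 25%, the replicate signal values, and the function name are hypothetical choices for this illustration; an actual screen would set its threshold from the assay's measured variability.

```python
from statistics import mean


def binding_differs(first_sample, second_sample, min_relative_change=0.25):
    """Return True if mean competitor binding differs between the two samples.

    first_sample  : replicate signals for protein + competitor
    second_sample : replicate signals for protein + competitor + test compound
    """
    m1, m2 = mean(first_sample), mean(second_sample)
    return abs(m2 - m1) / m1 >= min_relative_change


# Hypothetical labeled-competitor signals (arbitrary units), in triplicate.
without_compound = [1000.0, 1040.0, 960.0]
with_active_compound = [510.0, 490.0, 500.0]
with_inactive_compound = [980.0, 1010.0, 1010.0]

active_flag = binding_differs(without_compound, with_active_compound)
inactive_flag = binding_differs(without_compound, with_inactive_compound)
```

Here a compound that halves competitor binding is flagged, while one that leaves it essentially unchanged is not.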

Alternatively, differential screening is used to identify drug candidates that
bind to the native angiogenesis protein, but cannot bind to modified angiogenesis proteins.
The structure of the angiogenesis protein may be modeled, and used in rational drug design to
synthesize agents that interact with that site. Drug candidates that affect the activity of an
angiogenesis protein are also identified by screening drugs for the ability to either enhance or
reduce the activity of the protein.

Positive controls and negative controls may be used in the assays. Preferably
control and test samples are performed in at least triplicate to obtain statistically significant
results. Incubation of all samples is for a time sufficient for the binding of the agent to the
protein. Following incubation, samples are washed free of non-specifically bound material
and the amount of bound, generally labeled, agent is determined. For example, where a
radiolabel is employed, the samples may be counted in a scintillation counter to determine the
amount of bound compound.
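
As a minimal sketch of how triplicate scintillation counts might be reduced to a single value, the following subtracts the mean negative-control signal from the mean test signal and reports the replicate spread. The counts, the function name, and the choice of summary statistics are assumptions for illustration only.

```python
from statistics import mean, stdev


def specific_binding(test_cpm, negative_cpm):
    """Return (mean specific counts, sample standard deviation of test replicates)."""
    if len(test_cpm) < 3 or len(negative_cpm) < 3:
        raise ValueError("at least triplicate measurements are expected")
    return mean(test_cpm) - mean(negative_cpm), stdev(test_cpm)


# Hypothetical counts per minute from a scintillation counter.
test_counts = [5200, 5000, 4800]
negative_control_counts = [210, 190, 200]

specific, spread = specific_binding(test_counts, negative_control_counts)
```

The replicate spread gives a rough indication of whether a difference between test and control samples exceeds the assay's own variability.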

A variety of other reagents may be included in the screening assays. These
include reagents like salts, neutral proteins, e.g., albumin, detergents, etc., which may be used
to facilitate optimal protein-protein binding and/or reduce non-specific or background
interactions. Also reagents that otherwise improve the efficiency of the assay, such as
protease inhibitors, nuclease inhibitors, anti-microbial agents, etc., may be used. The mixture
of components may be added in an order that provides for the requisite binding.

In a preferred embodiment, the invention provides methods for screening for a
compound capable of modulating the activity of an angiogenesis protein. The methods
comprise adding a test compound, as defined above, to a cell comprising angiogenesis
proteins. Preferred cell types include almost any cell. The cells contain a recombinant
nucleic acid that encodes an angiogenesis protein. In a preferred embodiment, a library of
candidate agents is tested on a plurality of cells.

In one aspect, the assays are evaluated in the presence or absence or previous
or subsequent exposure of physiological signals, for example hormones, antibodies, peptides,
antigens, cytokines, growth factors, action potentials, pharmacological agents including
chemotherapeutics, radiation, carcinogenics, or other cells (i.e., cell-cell contacts). In another
example, the determinations are made at different stages of the cell cycle process.

In this way, compounds that modulate angiogenesis agents are identified.
Compounds with pharmacological activity are able to enhance or interfere with the activity of
the angiogenesis protein. Once identified, similar structures are evaluated to identify critical
structural features of the compound.

In one embodiment, a method of inhibiting angiogenic cell division is
provided. The method comprises administration of an angiogenesis inhibitor. In another
embodiment, a method of inhibiting angiogenesis is provided. The method comprises
administration of an angiogenesis inhibitor. In a further embodiment, methods of treating
cells or individuals with angiogenesis are provided. The method comprises administration of
an angiogenesis inhibitor.

In one embodiment, an angiogenesis inhibitor is an antibody as discussed 
above. In another embodiment, the angiogenesis inhibitor is an antisense molecule. 

Polynucleotide modulators of angiogenesis 
Antisense Polynucleotides

In certain embodiments, the activity of an angiogenesis-associated protein is
downregulated, or entirely inhibited, by the use of antisense polynucleotide, i.e., a nucleic
acid complementary to, and which can preferably hybridize specifically to, a coding mRNA
nucleic acid sequence, e.g., an angiogenesis protein mRNA, or a subsequence thereof.
Binding of the antisense polynucleotide to the mRNA reduces the translation and/or stability
of the mRNA.

In the context of this invention, antisense polynucleotides can comprise
naturally-occurring nucleotides, or synthetic species formed from naturally-occurring
subunits or their close homologs. Antisense polynucleotides may also have altered sugar
moieties or inter-sugar linkages. Exemplary among these are the phosphorothioate and other
sulfur containing species which are known for use in the art. Analogs are comprehended by
this invention so long as they function effectively to hybridize with the angiogenesis protein
mRNA. See, e.g., Isis Pharmaceuticals, Carlsbad, CA; Sequitor, Inc., Natick, MA.

1 0 Such antisense polynucleotides can readily be synthesized using recombinant 

means, or can be synthesized in vitro. Equipment for such synthesis is sold by several 
vendors, including Applied Biosystems. The preparation of other oligonucleotides such as 
phosphorothioates and alkylated derivatives is also well known to those of skill in the art. 
Antisense molecules as used herein include antisense or sense 

15 oligonucleotides. Sense oligonucleotides can, e.g., be employed to block trancription by 
binding to the anti-sense strand. The antisense and sense oligonucleotide comprise a single- 
stranded nucleic acid sequence (either RNA or DNA) capable of binding to target mRNA 
(sense) or DNA (antisense) sequences for angiogenesis molecules. A preferred antisense 
molecule is for an angiogenesis sequences in Tables 1-8 , or for a Ugand or activator thereof. 

20 Antisense or sense oligonucleotides, according to the present invention, comprise a fragment 
generally at least about 14 nucleotides, preferably from about 14 to 30 nucleotides. The 
ability to derive an antisense or a sense oligonucleotide, based upon a cDNA sequence 
encoding a given protein is described in, for example, Stein and Cohen (Cancer Res. 48:2659, 
1988) and van der Krol et al. (BioTechniques 6:958, 1988). 
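
As a rough illustration of deriving antisense oligonucleotides from a cDNA sequence, the sketch below lists the reverse complement of each window of a chosen length within the roughly 14 to 30 nucleotide range given above. The function names and the example sequence are hypothetical and are not taken from Tables 1-8; a real design would also weigh target accessibility, specificity, and backbone chemistry, none of which is modeled here.

```python
# Illustrative sketch only: candidate antisense oligonucleotides from a cDNA.
# Function names and the example sequence are hypothetical.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_candidates(cdna: str, length: int = 20) -> list[tuple[int, str]]:
    """Return (0-based start, antisense oligo) for every window of `length` bases."""
    if not 14 <= length <= 30:
        raise ValueError("length should be about 14 to 30 nucleotides")
    return [
        (start, reverse_complement(cdna[start:start + length]))
        for start in range(len(cdna) - length + 1)
    ]


example_cdna = "ATGGCCAAGCTGACCTTCGGAGTC"  # made-up 24-nt sequence
for start, oligo in antisense_candidates(example_cdna, length=20):
    print(start, oligo)
```

Each printed oligonucleotide is complementary to the sense (mRNA-like) strand of the corresponding window, which is the property the text relies on for hybridization to the target mRNA.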


Ribozymes 

In addition to antisense polynucleotides, ribozymes can be used to target and 
inhibit transcription of angiogenesis-associated nucleotide sequences. A ribozyme is an RNA 
molecule that catalytically cleaves other RNA molecules. Different kinds of ribozymes have 
been described, including group I ribozymes, hammerhead ribozymes, hairpin ribozymes,
RNase P, and axhead ribozymes (see, e.g., Castanotto et al. (1994) Adv. in Pharmacology 25:
289-317 for a general review of the properties of different ribozymes).

The general features of hairpin ribozymes are described, e.g., in Hampel et al.
(1990) Nucl. Acids Res. 18: 299-304; Hampel et al. (1990) European Patent Publication No. 0
360 257; U.S. Patent No. 5,254,678. Methods of preparing ribozymes are well known to those of
skill in the art (see, e.g., Wong-Staal et al., WO 94/26877; Ojwang et al. (1993) Proc. Natl. Acad.
Sci. USA 90: 6340-6344; Yamada et al. (1994) Human Gene Therapy 1: 39-45; Leavitt et al.
(1995) Proc. Natl. Acad. Sci. USA 92: 699-703; Leavitt et al. (1994) Human Gene Therapy 5:
1151-120; and Yamada et al. (1994) Virology 205: 121-126).

Polynucleotide modulators of angiogenesis may be introduced into a cell
containing the target nucleotide sequence by formation of a conjugate with a ligand binding
molecule, as described in WO 91/04753. Suitable ligand binding molecules include, but are
not limited to, cell surface receptors, growth factors, other cytokines, or other ligands that
bind to cell surface receptors. Preferably, conjugation of the ligand binding molecule does
not substantially interfere with the ability of the ligand binding molecule to bind to its
corresponding molecule or receptor, or block entry of the sense or antisense oligonucleotide
or its conjugated version into the cell. Alternatively, a polynucleotide modulator of
angiogenesis may be introduced into a cell containing the target nucleic acid sequence, e.g.,
by formation of a polynucleotide-lipid complex, as described in WO 90/10448. It is
understood that antisense molecules or knock out and knock in models may also be
used in screening assays as discussed above, in addition to methods of treatment.

Thus, in one embodiment, methods of modulating angiogenesis in cells or
organisms are provided. In one embodiment, the methods comprise administering to a cell an
anti-angiogenesis antibody that reduces or eliminates the biological activity of an
endogenous angiogenesis protein. Alternatively, the methods comprise administering to a
cell or organism a recombinant nucleic acid encoding an angiogenesis protein. This may be
accomplished in any number of ways. In a preferred embodiment, for example when the
angiogenesis sequence is down-regulated in angiogenesis, such state may be reversed by
increasing the amount of angiogenesis gene product in the cell. This can be accomplished,
e.g., by overexpressing the endogenous angiogenesis gene or administering a gene encoding
the angiogenesis sequence, using known gene-therapy techniques, for example. In a
preferred embodiment, the gene therapy techniques include the incorporation of the
exogenous gene using enhanced homologous recombination (EHR), for example as described
in PCT/US93/03868, hereby incorporated by reference in its entirety. Alternatively, for
example when the angiogenesis sequence is up-regulated in angiogenesis, the activity of the
endogenous angiogenesis gene is decreased, for example by the administration of an
angiogenesis antisense nucleic acid or other inhibitor, such as RNAi.

In one embodiment, the angiogenesis proteins of the present invention may
be used to generate polyclonal and monoclonal antibodies to angiogenesis proteins.
Similarly, the angiogenesis proteins can be coupled, using standard technology, to affinity
chromatography columns. These columns may then be used to purify angiogenesis
antibodies useful for production, diagnostic, or therapeutic purposes. In a preferred
embodiment, the antibodies are generated to epitopes unique to an angiogenesis protein; that
is, the antibodies show little or no cross-reactivity to other proteins. The angiogenesis
antibodies may be coupled to standard affinity chromatography columns and used to purify
angiogenesis proteins. The antibodies may also be used as blocking polypeptides, as outlined
above, since they will specifically bind to the angiogenesis protein.

Methods of identifying variant angiogenesis-associated sequences 

Without being bound by theory, expression of various angiogenesis sequences
is correlated with angiogenesis. Accordingly, disorders based on mutant or variant
angiogenesis genes may be determined. In one embodiment, the invention provides methods
for identifying cells containing variant angiogenesis genes, e.g., determining all or part of the
sequence of at least one endogenous angiogenesis gene in a cell. This may be
accomplished using any number of sequencing techniques. In a preferred embodiment, the
invention provides methods of identifying the angiogenesis genotype of an individual, e.g.,
determining all or part of the sequence of at least one angiogenesis gene of the individual.
This is generally done in at least one tissue of the individual, and may include the evaluation
of a number of tissues or different samples of the same tissue. The method may include
comparing the sequence of the sequenced angiogenesis gene to a known angiogenesis gene,
i.e., a wild-type gene.

The sequence of all or part of the angiogenesis gene can then be compared to
the sequence of a known angiogenesis gene to determine if any differences exist. This can be
done using any number of known homology programs, such as Bestfit, etc. In a preferred
embodiment, the presence of a difference in the sequence between the angiogenesis gene of
the patient and the known angiogenesis gene correlates with a disease state or a propensity
for a disease state, as outlined herein.
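
The comparison step can be pictured with a minimal sketch that reports position-by-position differences between a sample sequence and a reference. It assumes the two sequences are already aligned and of equal length, which a program such as Bestfit would normally establish first; the function name and both sequences are hypothetical.

```python
# Illustrative sketch only: list differences between a sample sequence and a
# known (reference) sequence. Assumes the sequences are pre-aligned and of
# equal length; names and sequences are hypothetical.


def find_differences(sample: str, reference: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, sample base) for each mismatch."""
    if len(sample) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(
            zip(reference.upper(), sample.upper()), start=1
        )
        if ref_base != sample_base
    ]


reference_seq = "ATGGCCAAGCTGACC"  # made-up wild-type fragment
sample_seq = "ATGGCCAGGCTGACC"     # made-up patient fragment
print(find_differences(sample_seq, reference_seq))
```

An empty list indicates no difference over the compared region; insertions and deletions would require a true alignment rather than this direct comparison.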

In a preferred embodiment, the angiogenesis genes are used as probes to 
determine the number of copies of the angiogenesis gene in the genome. 

In another preferred embodiment, the angiogenesis genes are used as probes to
determine the chromosomal localization of the angiogenesis genes. Information such as
chromosomal localization finds use in providing a diagnosis or prognosis, in particular when
chromosomal abnormalities such as translocations and the like are identified in the
angiogenesis gene locus.

WO 02/079492

PCT/US02/04915

Administration of pharmaceutical and vaccine compositions

In one embodiment, a therapeutically effective dose of an angiogenesis protein
or modulator thereof is administered to a patient. By "therapeutically effective dose" herein
is meant a dose that produces effects for which it is administered. The exact dose will depend
on the purpose of the treatment, and will be ascertainable by one skilled in the art using
known techniques (e.g., Ansel et al., Pharmaceutical Dosage Forms and Drug Delivery,
Lippincott, Williams & Wilkins Publishers, ISBN: 0683305727; Lieberman (1992)
Pharmaceutical Dosage Forms (vols. 1-3), Dekker, ISBN 0824770846, 082476918X,
0824712692, 0824716981; Lloyd (1999) The Art, Science and Technology of Pharmaceutical
Compounding, Amer. Pharmaceutical Assn, ISBN 0917330889; and Pickar (1999) Dosage
Calculations, Delmar Pub, ISBN 0766805042). As is known in the art, adjustments for
angiogenesis degradation, systemic versus localized delivery, and rate of new protease
synthesis, as well as the age, body weight, general health, sex, diet, time of administration,
drug interaction and the severity of the condition may be necessary, and will be ascertainable
with routine experimentation by those skilled in the art.

A "patient" for the purposes of the present invention includes both humans and
other animals, particularly mammals. Thus the methods are applicable to both human
therapy and veterinary applications. In the preferred embodiment the patient is a mammal, 
preferably a primate, and in the most preferred embodiment the patient is human. 

The administration of the angiogenesis proteins and modulators thereof of the
present invention can be done in a variety of ways as discussed above, including, but not
limited to, orally, subcutaneously, intravenously, intranasally, transdermally, 
intraperitoneally, intramuscularly, intrapulmonary, vaginally, rectally, or intraocularly. In 
some instances, for example, in the treatment of wounds and inflammation, the angiogenesis 
proteins and modulators may be directly applied as a solution or spray. 

The pharmaceutical compositions of the present invention comprise an
angiogenesis protein in a form suitable for administration to a patient. In the preferred
embodiment, the pharmaceutical compositions are in a water soluble form, such as being
present as pharmaceutically acceptable salts, which is meant to include both acid and base
addition salts. "Pharmaceutically acceptable acid addition salt" refers to those salts that retain
the biological effectiveness of the free bases and that are not biologically or otherwise
undesirable, formed with inorganic acids such as hydrochloric acid, hydrobromic acid,
sulfuric acid, nitric acid, phosphoric acid and the like, and organic acids such as acetic acid,
propionic acid, glycolic acid, pyruvic acid, oxalic acid, maleic acid, malonic acid, succinic
acid, fumaric acid, tartaric acid, citric acid, benzoic acid, cinnamic acid, mandelic acid,
methanesulfonic acid, ethanesulfonic acid, p-toluenesulfonic acid, salicylic acid and the like.
"Pharmaceutically acceptable base addition salts" include those derived from inorganic bases
such as sodium, potassium, lithium, ammonium, calcium, magnesium, iron, zinc, copper,
manganese, aluminum salts and the like. Particularly preferred are the ammonium,
potassium, sodium, calcium, and magnesium salts. Salts derived from pharmaceutically
acceptable organic non-toxic bases include salts of primary, secondary, and tertiary amines,
substituted amines including naturally occurring substituted amines, cyclic amines and basic
ion exchange resins, such as isopropylamine, trimethylamine, diethylamine, triethylamine,
tripropylamine, and ethanolamine.

The pharmaceutical compositions may also include one or more of the
following: carrier proteins such as serum albumin; buffers; fillers such as microcrystalline
cellulose, lactose, corn and other starches; binding agents; sweeteners and other flavoring 
agents; coloring agents; and polyethylene glycol. 

The pharmaceutical compositions can be administered in a variety of unit
dosage forms depending upon the method of administration. For example, unit dosage forms
suitable for oral administration include, but are not limited to, powder, tablets, pills, capsules
and lozenges. It is recognized that angiogenesis protein modulators (e.g., antibodies,
antisense constructs, ribozymes, small organic molecules, etc.), when administered orally,
should be protected from digestion. This is typically accomplished either by complexing the
molecule(s) with a composition to render it resistant to acidic and enzymatic hydrolysis, or by
packaging the molecule(s) in an appropriately resistant carrier, such as a liposome or a
protection barrier. Means of protecting agents from digestion are well known in the art.

The compositions for administration will commonly comprise an angiogenesis
protein modulator dissolved in a pharmaceutically acceptable carrier, preferably an aqueous
carrier. A variety of aqueous carriers can be used, e.g., buffered saline and the like. These
solutions are sterile and generally free of undesirable matter. These compositions may be
sterilized by conventional, well known sterilization techniques. The compositions may
contain pharmaceutically acceptable auxiliary substances as required to approximate
physiological conditions such as pH adjusting and buffering agents, toxicity adjusting agents
and the like, for example, sodium acetate, sodium chloride, potassium chloride, calcium
chloride, sodium lactate and the like. The concentration of active agent in these formulations
can vary widely, and will be selected primarily based on fluid volumes, viscosities, body
weight and the like in accordance with the particular mode of administration selected and the
patient's needs (e.g., Remington's Pharmaceutical Science, 15th ed., Mack Publishing
Company, Easton, Pennsylvania (1980) and Goodman and Gilman, The Pharmacological
Basis of Therapeutics, (Hardman, J.G., Limbird, L.E., Molinoff, P.B., Ruddon, R.W., and
Gilman, A.G., eds) The McGraw-Hill Companies, Inc., 1996).

Thus, a typical pharmaceutical composition for intravenous administration
would be about 0.1 to 10 mg per patient per day. Dosages from 0.1 up to about 100 mg per
patient per day may be used, particularly when the drug is administered to a secluded site and
not into the blood stream, such as into a body cavity or into a lumen of an organ.
Substantially higher dosages are possible in topical administration. Actual methods for
preparing parenterally administrable compositions will be known or apparent to those skilled
in the art, e.g., Remington's Pharmaceutical Science and Goodman and Gilman, The
Pharmacological Basis of Therapeutics, supra.

The compositions containing modulators of angiogenesis proteins can be
administered for therapeutic or prophylactic treatments. In therapeutic applications,
compositions are administered to a patient suffering from a disease (e.g., a cancer) in an
amount sufficient to cure or at least partially arrest the disease and its complications. An
amount adequate to accomplish this is defined as a "therapeutically effective dose." Amounts
effective for this use will depend upon the severity of the disease and the general state of the
patient's health. Single or multiple administrations of the compositions may be administered
depending on the dosage and frequency as required and tolerated by the patient. In any event,
the composition should provide a sufficient quantity of the agents of this invention to
effectively treat the patient. An amount of modulator that is capable of preventing or slowing
the development of cancer in a mammal is referred to as a "prophylactically effective dose."
The particular dose required for a prophylactic treatment will depend upon the medical
condition and history of the mammal, the particular cancer being prevented, as well as other
factors such as age, weight, gender, administration route, efficiency, etc. Such prophylactic
treatments may be used, e.g., in a mammal who has previously had cancer to prevent a
recurrence of the cancer, or in a mammal who is suspected of having a significant likelihood
of developing cancer.

It will be appreciated that the present angiogenesis protein-modulating
compounds can be administered alone or in combination with additional angiogenesis
modulating compounds or with other therapeutic agents, e.g., other anti-cancer agents or
treatments.

In numerous embodiments, one or more nucleic acids, e.g., polynucleotides
comprising nucleic acid sequences set forth in Tables 1-8, such as antisense polynucleotides
or ribozymes, will be introduced into cells, in vitro or in vivo. The present invention provides
methods, reagents, vectors, and cells useful for expression of angiogenesis-associated
polypeptides and nucleic acids using in vitro (cell-free), ex vivo or in vivo (cell or
organism-based) recombinant expression systems.

The particular procedure used to introduce the nucleic acids into a host cell for
expression of a protein or nucleic acid is application specific. Many procedures for
introducing foreign nucleotide sequences into host cells may be used. These include the use
of calcium phosphate transfection, spheroplasts, electroporation, liposomes, microinjection,
plasmid vectors, viral vectors and any of the other well known methods for introducing cloned
genomic DNA, cDNA, synthetic DNA or other foreign genetic material into a host cell (see,
e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology
volume 152, Academic Press, Inc., San Diego, CA (Berger); F.M. Ausubel et al., eds., Current
Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley &
Sons, Inc. (supplemented through 1999); and Sambrook et al., Molecular Cloning - A
Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1989).

In a preferred embodiment, angiogenesis proteins and modulators are
administered as therapeutic agents, and can be formulated as outlined above. Similarly,
angiogenesis genes (including both the full-length sequence, partial sequences, or regulatory
sequences of the angiogenesis coding regions) can be administered in a gene therapy
application. These angiogenesis genes can include antisense applications, either as gene
therapy (i.e., for incorporation into the genome) or as antisense compositions, as will be
appreciated by those in the art.

Angiogenesis polypeptides and polynucleotides can also be administered as
vaccine compositions to stimulate HTL, CTL and antibody responses. Such vaccine
compositions can include, for example, lipidated peptides (e.g., Vitiello, A. et al., J. Clin.
Invest. 95:341, 1995), peptide compositions encapsulated in poly(DL-lactide-co-glycolide)
("PLG") microspheres (see, e.g., Eldridge et al., Molec. Immunol. 28:287-294, 1991; Alonso
et al., Vaccine 12:299-306, 1994; Jones et al., Vaccine 13:675-681, 1995), peptide
compositions contained in immune stimulating complexes (ISCOMS) (see, e.g., Takahashi et
al., Nature 344:873-875, 1990; Hu et al., Clin Exp Immunol 113:235-243, 1998), multiple
antigen peptide systems (MAPs) (see, e.g., Tam, J. P., Proc. Natl. Acad. Sci. U.S.A.
85:5409-5413, 1988; Tam, J. P., J. Immunol. Methods 196:17-32, 1996), peptides formulated as
multivalent peptides; peptides for use in ballistic delivery systems, typically crystallized
peptides, viral delivery vectors (Perkus, M. E. et al., In: Concepts in vaccine development,
Kaufmann, S. H. E., ed., p. 379, 1996; Chakrabarti, S. et al., Nature 320:535, 1986; Hu, S. L.
et al., Nature 320:537, 1986; Kieny, M.-P. et al., AIDS Bio/Technology 4:790, 1986; Top, F.
H. et al., J. Infect. Dis. 124:148, 1971; Chanda, P. K. et al., Virology 175:535, 1990),
particles of viral or synthetic origin (e.g., Kofler, N. et al., J. Immunol. Methods 192:25,
1996; Eldridge, J. H. et al., Sem. Hematol. 30:16, 1993; Falo, L. D., Jr. et al., Nature Med.
7:649, 1995), adjuvants (Warren, H. S., Vogel, F. R., and Chedid, L. A., Annu. Rev. Immunol.
4:369, 1986; Gupta, R. K. et al., Vaccine 11:293, 1993), liposomes (Reddy, R. et al., J.
Immunol. 148:1585, 1992; Rock, K. L., Immunol. Today 17:131, 1996), or naked or particle
absorbed cDNA (Ulmer, J. B. et al., Science 259:1745, 1993; Robinson, H. L., Hunt, L. A.,
and Webster, R. G., Vaccine 11:957, 1993; Shiver, J. W. et al., In: Concepts in vaccine
development, Kaufmann, S. H. E., ed., p. 423, 1996; Cease, K. B., and Berzofsky, J. A.,
Annu. Rev. Immunol. 12:923, 1994; and Eldridge, J. H. et al., Sem. Hematol. 30:16, 1993).
Toxin-targeted delivery technologies, also known as receptor mediated targeting, such as
those of Avant Immunotherapeutics, Inc. (Needham, Massachusetts) may also be used.

Vaccine compositions often include adjuvants. Many adjuvants contain a
substance designed to protect the antigen from rapid catabolism, such as aluminum hydroxide
or mineral oil, and a stimulator of immune responses, such as lipid A, Bordetella pertussis or
Mycobacterium tuberculosis derived proteins. Certain adjuvants are commercially available
as, for example, Freund's Incomplete Adjuvant and Complete Adjuvant (Difco Laboratories,
Detroit, MI); Merck Adjuvant 65 (Merck and Company, Inc., Rahway, NJ); AS-2
(SmithKline Beecham, Philadelphia, PA); aluminum salts such as aluminum hydroxide gel
(alum) or aluminum phosphate; salts of calcium, iron or zinc; an insoluble suspension of
acylated tyrosine; acylated sugars; cationically or anionically derivatized polysaccharides;
polyphosphazenes; biodegradable microspheres; monophosphoryl lipid A and quil A.
Cytokines, such as GM-CSF, interleukin-2, -7, -12, and other like growth factors, may also be
used as adjuvants.

Vaccines can be administered as nucleic acid compositions wherein DNA or
RNA encoding one or more of the polypeptides, or a fragment thereof, is administered to a
patient. This approach is described, for instance, in Wolff et al., Science 247:1465 (1990) as
well as U.S. Patent Nos. 5,580,859; 5,589,466; 5,804,566; 5,739,118; 5,736,524; 5,679,647;
WO 98/04720; and in more detail below. Examples of DNA-based delivery technologies
include "naked DNA", facilitated (bupivacaine, polymers, peptide-mediated) delivery,
cationic lipid complexes, and particle-mediated ("gene gun") or pressure-mediated delivery
(see, e.g., U.S. Patent No. 5,922,687).

For therapeutic or prophylactic immunization purposes, the peptides of the
invention can be expressed by viral or bacterial vectors. Examples of expression vectors
include attenuated viral hosts, such as vaccinia or fowlpox. This approach involves the use of
vaccinia virus, for example, as a vector to express nucleotide sequences that encode
angiogenic polypeptides or polypeptide fragments. Upon introduction into a host, the
recombinant vaccinia virus expresses the immunogenic peptide, and thereby elicits an
immune response. Vaccinia vectors and methods useful in immunization protocols are
described in, e.g., U.S. Patent No. 4,722,848. Another vector is BCG (Bacille Calmette
Guerin). BCG vectors are described in Stover et al., Nature 351:456-460 (1991). A wide
variety of other vectors useful for therapeutic administration or immunization, e.g., adeno and
adeno-associated virus vectors, retroviral vectors, Salmonella typhi vectors, detoxified
anthrax toxin vectors, and the like, will be apparent to those skilled in the art from the
description herein (see, e.g., Shata et al. (2000) Mol Med Today 6: 66-71; Shedlock et al., J
Leukoc Biol 68:793-806, 2000; Hipp et al., In Vivo 14:571-85, 2000).

Methods for the use of genes as DNA vaccines are well known, and include
placing an angiogenesis gene or portion of an angiogenesis gene under the control of a
regulatable promoter or a tissue-specific promoter for expression in an angiogenesis patient.
The angiogenesis gene used for DNA vaccines can encode full-length angiogenesis proteins,
but more preferably encodes portions of the angiogenesis proteins including peptides derived
from the angiogenesis protein. In one embodiment, a patient is immunized with a DNA
vaccine comprising a plurality of nucleotide sequences derived from an angiogenesis gene.
For example, angiogenesis-associated genes or sequences encoding subfragments of an
angiogenesis protein are introduced into expression vectors and tested for their
immunogenicity in the context of Class I MHC and an ability to generate cytotoxic T cell
responses. This procedure provides for production of cytotoxic T cell responses against cells
which present antigen, including intracellular epitopes.

In a preferred embodiment, the DNA vaccines include a gene encoding an
adjuvant molecule with the DNA vaccine. Such adjuvant molecules include cytokines that
increase the immunogenic response to the angiogenesis polypeptide encoded by the DNA
vaccine. Additional or alternative adjuvants are available.

In another preferred embodiment, angiogenesis genes find use in generating
animal models of angiogenesis. When the angiogenesis gene identified is repressed or
diminished in angiogenic tissue, gene therapy technology, e.g., wherein antisense RNA is
directed to the angiogenesis gene, will also diminish or repress expression of the gene.
Animal models of angiogenesis find use in screening for modulators of an
angiogenesis-associated sequence or modulators of angiogenesis. Similarly, transgenic animal technology
including gene knockout technology, for example as a result of homologous recombination
with an appropriate gene targeting vector, will result in the absence or increased expression
of the angiogenesis protein. When desired, tissue-specific expression or knockout of the
angiogenesis protein may be necessary.

It is also possible that the angiogenesis protein is overexpressed in
angiogenesis. As such, transgenic animals can be generated that overexpress the
angiogenesis protein. Depending on the desired expression level, promoters of various
strengths can be employed to express the transgene. Also, the number of copies of the
integrated transgene can be determined and compared for a determination of the expression
level of the transgene. Animals generated by such methods find use as animal models of
angiogenesis and are additionally useful in screening for modulators to treat angiogenesis or
to evaluate a therapeutic entity.

Kits for Use in Diagnostic and/or Prognostic Applications 

For use in diagnostic, research, and therapeutic applications suggested above,
kits are also provided by the invention. In the diagnostic and research applications such kits
may include any or all of the following: assay reagents, buffers, angiogenesis-specific nucleic
acids or antibodies, hybridization probes and/or primers, antisense polynucleotides,
ribozymes, dominant negative angiogenesis polypeptides or polynucleotides, small molecule
inhibitors of angiogenesis-associated sequences, etc. A therapeutic product may include
sterile saline or another pharmaceutically acceptable emulsion and suspension base.

In addition, the kits may include instructional materials containing directions
(i.e., protocols) for the practice of the methods of this invention. While the instructional
materials typically comprise written or printed materials they are not limited to such. Any
medium capable of storing such instructions and communicating them to an end user is
contemplated by this invention. Such media include, but are not limited to, electronic storage
media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), and the
like. Such media may include addresses to internet sites that provide such instructional
materials.

The present invention also provides for kits for screening for modulators of
angiogenesis-associated sequences. Such kits can be prepared from readily available
materials and reagents. For example, such kits can comprise one or more of the following
materials: an angiogenesis-associated polypeptide or polynucleotide, reaction tubes, and
instructions for testing angiogenesis-associated activity. Optionally, the kit contains
biologically active angiogenesis protein. A wide variety of kits and components can be
prepared according to the present invention, depending upon the intended user of the kit and
the particular needs of the user. Diagnosis would typically involve evaluation of a plurality
of genes or products. The genes will be selected based on correlations with important
parameters in disease which may be identified in historical or outcome data.

It is understood that the examples described above in no way serve to limit 
the true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent
application were specifically and individually indicated to be incorporated by reference. 

EXAMPLES 

Example 1: Tissue Preparation, Labeling Chips, and Fingerprints
Purify total RNA from tissue using TRIzol Reagent 

Homogenize tissue samples in 1 ml of TRIzol per 50 mg of tissue using a
Polytron 3100 homogenizer. The generator/probe used depends upon the tissue size. A
generator that is too large for the amount of tissue to be homogenized will cause a loss of
sample and lower RNA yield. TRIzol is added directly to frozen tissue, which is then
homogenized. Following homogenization, insoluble material is removed by centrifugation at
7500 x g for 15 min in a Sorvall superspeed or 12,000 x g for 10 min in an Eppendorf
centrifuge at 4°C. The clear homogenate is transferred to a new tube for use. The samples
may be frozen now at -60° to -70°C (and kept for at least one month). The homogenate is
mixed with 0.2 ml of chloroform per 1 ml of TRIzol reagent used in the original
homogenization and incubated at room temperature for 2-3 minutes. The aqueous phase is then
separated by centrifugation and transferred to a fresh tube and the RNA precipitated using
isopropyl alcohol. The pellet is isolated by centrifugation, washed, air-dried, resuspended in
an appropriate volume of DEPC H2O, and the absorbance measured.
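
The reagent ratios in this step scale linearly with tissue mass, so a small helper can be used to plan volumes. The sketch below is illustrative only: the function names are hypothetical, and the conversion factor of 40 ug/ml per A260 unit for RNA is a commonly used convention that is assumed here rather than stated in this protocol.

```python
# Illustrative sketch only: reagent volumes for the TRIzol step and an RNA
# concentration estimate from absorbance. Function names are hypothetical.


def trizol_volumes(tissue_mg: float) -> dict[str, float]:
    """Volumes (ml) using 1 ml TRIzol per 50 mg tissue and 0.2 ml chloroform per ml TRIzol."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol_ml = tissue_mg / 50.0
    return {"trizol_ml": trizol_ml, "chloroform_ml": 0.2 * trizol_ml}


def rna_conc_ug_per_ml(a260: float, dilution_factor: float = 1.0,
                       ug_per_ml_per_a260: float = 40.0) -> float:
    """Estimate RNA concentration; the 40 ug/ml per A260 unit factor is an assumption."""
    return a260 * dilution_factor * ug_per_ml_per_a260


print(trizol_volumes(125.0))
print(rna_conc_ug_per_ml(0.25, dilution_factor=100.0))
```

For example, 125 mg of tissue would call for 2.5 ml of TRIzol and 0.5 ml of chloroform under these ratios.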

Purification of poly A+ mRNA from total RNA is performed as follows. Heat
an Oligotex suspension to 37°C and mix immediately before adding to RNA. The
Elution Buffer is heated at 70°C. Warm up 2 x Binding Buffer at 65°C if there is precipitate
in the buffer. Mix total RNA with DEPC-treated water, 2 x Binding Buffer, and Oligotex
according to Table 2 on page 16 of the Oligotex Handbook. Incubate for 3 minutes at 65°C.
Incubate for 10 minutes at room temperature. Centrifuge for 2 minutes at 14,000 to 18,000 x g.
Remove supernatant without disturbing Oligotex pellet. A little bit of solution can be left
behind to reduce the loss of Oligotex. Gently resuspend in Wash Buffer OW2 and pipet onto
spin column. Centrifuge the spin column at full speed for 1 minute. Transfer spin column to
a new collection tube and gently resuspend in Wash Buffer OW2 and centrifuge as described
herein. Transfer spin column to a new tube and elute with 20 to 100 ul of preheated (70°C)
Elution Buffer. Gently resuspend Oligotex resin by pipetting up and down. Centrifuge as
above. Repeat elution with fresh elution buffer or use first eluate to keep the elution volume
low. Read absorbance, using diluted Elution Buffer as the blank. Before proceeding with
cDNA synthesis, precipitate the mRNA as follows: add 0.4 vol. of 7.5 M NH4OAc + 2.5 vol.
of cold 100% ethanol. Precipitate at -20°C 1 hour to overnight (or 20-30 min. at -70°C).
Centrifuge at 14,000-16,000 x g for 30 minutes at 4°C. Wash pellet with 0.5 ml of
80% ethanol (-20°C) then centrifuge at 14,000-16,000 x g for 5 minutes at room temperature.
Repeat 80% ethanol wash. Air dry the ethanol from the pellet in the hood. Suspend pellet in
DEPC H2O at 1 ug/ul concentration.
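
The precipitation and resuspension volumes can be worked out as in the sketch below. It assumes that both "0.4 vol." and "2.5 vol." are taken relative to the starting eluate volume, which the text does not state explicitly; if the ethanol is instead meant relative to the eluate plus ammonium acetate, the ethanol figure would be larger. The function names are hypothetical.

```python
# Illustrative sketch only: volumes for ammonium acetate/ethanol precipitation
# and for resuspending the pellet at 1 ug/ul. Assumes both "vol." figures are
# relative to the starting sample volume. Function names are hypothetical.


def precipitation_volumes(sample_ul: float) -> dict[str, float]:
    """Return ul of 7.5 M NH4OAc (0.4 vol.) and cold 100% ethanol (2.5 vol.)."""
    if sample_ul <= 0:
        raise ValueError("sample volume must be positive")
    return {"nh4oac_ul": 0.4 * sample_ul, "ethanol_ul": 2.5 * sample_ul}


def resuspension_volume_ul(mrna_ug: float, target_ug_per_ul: float = 1.0) -> float:
    """Volume of DEPC-treated water needed to reach the target concentration."""
    if target_ug_per_ul <= 0:
        raise ValueError("target concentration must be positive")
    return mrna_ug / target_ug_per_ul


print(precipitation_volumes(100.0))
print(resuspension_volume_ul(3.5))
```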

To further clean up total RNA using Qiagen's RNeasy kit, add no more than
100 ug to an RNeasy column. Adjust sample to a volume of 100 ul with RNase-free water.
Add 350 ul Buffer RLT then 250 ul ethanol (100%) to the sample. Mix by pipetting (do not
centrifuge) then apply sample to an RNeasy mini spin column. Centrifuge for 15 sec at
>10,000 rpm. Transfer column to a new 2-ml collection tube. Add 500 ul Buffer RPE and
centrifuge for 15 sec at >10,000 rpm. Discard flowthrough. Add 500 ul Buffer RPE and
centrifuge for 15 sec at >10,000 rpm. Discard flowthrough then centrifuge for 2 min at
maximum speed to dry column membrane. Transfer column to a new 1.5-ml collection tube
and apply 30-50 ul of RNase-free water directly onto column membrane. Centrifuge 1 min at
>10,000 rpm. Repeat elution and read absorbance.

cDNA synthesis using Gibco's "Superscript Choice System for cDNA Synthesis" kit 
First Strand cDNA synthesis is performed as follows. Use 5 ug of total RNA
or 1 ug of polyA+ mRNA as starting material. For total RNA, use 2 ul of Superscript RT. For
polyA+ mRNA, use 1 ul of Superscript RT. Final volume of first strand synthesis mix is
20 ul. RNA must be in a volume no greater than 10 ul. Incubate RNA with 1 ul of 100 pmol
T7-T24 oligo for 10 min at 70°C. On ice, add 7 ul of the following mix: 4 ul 5X 1st Strand
Buffer, 2 ul of 0.1 M DTT, and 1 ul of 10 mM dNTP mix. Incubate at 37°C for 2 min then add
Superscript RT. Incubate at 37°C for 1 hour.

For the second strand synthesis, place 1st strand reactions on ice and add: 91 ul
DEPC H2O; 30 ul 5X 2nd Strand Buffer; 3 ul 10 mM dNTP mix; 1 ul 10 U/ul E. coli DNA
Ligase; 4 ul 10 U/ul E. coli DNA Polymerase; and 1 ul 2 U/ul RNase H. Mix and incubate 2
hours at 16°C. Add 2 ul T4 DNA Polymerase. Incubate 5 min at 16°C. Add 10 ul of 0.5 M
EDTA. A further clean-up of DNA is performed using phenol:chloroform:isoamyl alcohol
(25:24:1) purification.

In vitro Transcription (IVT) and labeling with biotin is performed as follows:
Pipet 1.5 ul of cDNA into a thin-wall PCR tube. Make NTP labeling mix by combining 2 ul T7
10x ATP (75 mM) (Ambion); 2 ul T7 10x GTP (75 mM) (Ambion); 1.5 ul T7 10x CTP (75 mM)
(Ambion); 1.5 ul T7 10x UTP (75 mM) (Ambion); 3.75 ul 10 mM Bio-11-UTP
(Boehringer-Mannheim/Roche or Enzo); 3.75 ul 10 mM Bio-16-CTP (Enzo); 2 ul 10x T7 transcription
buffer (Ambion); and 2 ul 10x T7 enzyme mix (Ambion). The final volume is 20 ul. Incubate
6 hours at 37°C in a PCR machine. The RNA can be further cleaned.
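
As a consistency check on the labeling mix above, the listed component volumes can be summed and scaled for several reactions. This is a sketch only: the 10% pipetting overage and the helper names are assumptions introduced for illustration, not part of the protocol.

```python
# Per-reaction volumes (ul) as listed in the IVT labeling step.
IVT_COMPONENTS_UL = {
    "cDNA template": 1.5,
    "T7 10x ATP (75 mM)": 2.0,
    "T7 10x GTP (75 mM)": 2.0,
    "T7 10x CTP (75 mM)": 1.5,
    "T7 10x UTP (75 mM)": 1.5,
    "10 mM Bio-11-UTP": 3.75,
    "10 mM Bio-16-CTP": 3.75,
    "10x T7 transcription buffer": 2.0,
    "10x T7 enzyme mix": 2.0,
}


def total_volume_ul(components):
    """Sum of all component volumes for one reaction."""
    return sum(components.values())


def master_mix_ul(components, n_reactions, overage=0.10, exclude=("cDNA template",)):
    """Scale shared reagents for n reactions; the template is added per tube.

    The overage fraction is an assumed allowance for pipetting loss.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2)
            for name, vol in components.items() if name not in exclude}


print(total_volume_ul(IVT_COMPONENTS_UL))  # should match the stated 20 ul
print(master_mix_ul(IVT_COMPONENTS_UL, n_reactions=4))
```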

Fragmentation is performed as follows. 15 ug of labeled RNA is usually
fragmented. Try to minimize the fragmentation reaction volume; a 10 ul volume is
recommended but 20 ul is all right. Do not go higher than 20 ul because the magnesium in
the fragmentation buffer contributes to precipitation in the hybridization buffer. Fragment
RNA by incubation at 94°C for 35 minutes in 1 x Fragmentation buffer (5 x Fragmentation
buffer is 200 mM Tris-acetate, pH 8.1; 500 mM KOAc; 150 mM MgOAc). The labeled
RNA transcript can be analyzed before and after fragmentation. Samples can be heated to
65°C for 15 minutes and electrophoresed on 1% agarose/TBE gels to get an approximate idea
of the transcript size range.
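
The 5 x to 1 x dilution and the 20 ul ceiling described for the fragmentation reaction can be sketched as follows. The function name is hypothetical; the stock composition and the volume limit are taken from the text.

```python
# Composition of the 5 x Fragmentation buffer (mM), as given in the text.
FRAG_BUFFER_5X_MM = {
    "Tris-acetate, pH 8.1": 200.0,
    "KOAc": 500.0,
    "MgOAc": 150.0,
}
MAX_REACTION_UL = 20.0  # the text advises not exceeding 20 ul


def fragmentation_setup(reaction_ul, stock_fold=5.0):
    """Return (volume of 5 x buffer in ul, final 1 x concentrations in mM)."""
    if not 0 < reaction_ul <= MAX_REACTION_UL:
        raise ValueError("reaction volume must be in (0, 20] ul")
    buffer_ul = reaction_ul / stock_fold
    final_mm = {name: conc / stock_fold for name, conc in FRAG_BUFFER_5X_MM.items()}
    return buffer_ul, final_mm


buffer_ul, final_mm = fragmentation_setup(10.0)
print(buffer_ul, final_mm)
```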

For hybridization, 200 ul (10 ug cRNA) of a hybridization mix is put on the
chip. If multiple hybridizations are to be done (such as cycling through a 5 chip set), then it
is recommended that an initial hybridization mix of 300 ul or more be made. The
hybridization mix is: fragmented labeled RNA (50 ng/ul final conc.); 50 pM 948-b control
oligo; 1.5 pM BioB; 5 pM BioC; 25 pM BioD; 100 pM CRE; 0.1 mg/ml herring sperm DNA;
0.5 mg/ml acetylated BSA; brought to 300 ul with 1x MES hyb buffer.

Labeling is performed as follows: The hybridization reaction includes
non-biotinylated IVT (purified by RNeasy columns); IVT antisense RNA 4 ng:ul; random
hexamers (1 ng/ul) 4 ul and water to 14 ul. The reaction is incubated at 70°C, 10 min.
Reverse transcription is performed in the following reaction: 5X First Strand (BRL) buffer,
6 ul; 0.1 M DTT, 3 ul; 50X dNTP mix, 0.6 ul; H2O, 2.4 ul; Cy3 or Cy5 dUTP (1 mM), 3 ul; SS
RT II (BRL), 1 ul in a final volume of 16 ul. Add to hybridization reaction. Incubate 30
min., 42°C. Add 1 ul SS II and incubate another hour. Put on ice. 50X dNTP mix (25 mM of
cold dATP, dCTP, and dGTP, 10 mM of dTTP): 25 ul each of 100 mM dATP, dCTP, and
dGTP; 10 ul of 100 mM dTTP to 15 ul H2O. dNTPs from Pharmacia.

RNA degradation is performed as follows. Add 86 ul H2O, 1.5 ul 1 M NaOH/
2 mM EDTA and incubate at 65°C, 10 min. For U-Con 30, 500 ul TE/sample, spin at 7000 g
for 10 min, save flow through for purification. For Qiagen purification, suspend U-Con
recovered material in 500 ul buffer PB and proceed using Qiagen protocol. For DNAse
digestion, add 1 ul of 1/100 dil of DNAse/30 ul Rx and incubate at 37°C for 15 min. Incubate
for 5 min at 95°C to denature the DNAse.

For sample preparation, add Cot-1 DNA, 10 ul; 50X dNTPs, 1 ul; 20X SSC,
2.3 ul; Na pyrophosphate, 7.5 ul; 10 mg/ml herring sperm DNA, 1 ul of 1/10 dilution; to 21.8 ul
final vol. Dry in speed vac. Resuspend in 15 ul H2O. Add 0.38 ul 10% SDS. Heat 95°C, 2
min and slow cool at room temp. for 20 min. Put on slide and hybridize overnight at 64°C.
Washing after the hybridization: 3X SSC/0.03% SDS: 2 min., 37.5 mls 20X SSC + 0.75 mls
10% SDS in 250 mls H2O; 1X SSC: 5 min., 12.5 mls 20X SSC in 250 mls H2O; 0.2X SSC: 5
min., 2.5 mls 20X SSC in 250 mls H2O. Dry slides and scan at appropriate PMTs and
channels.
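
The wash recipes above can be checked against their nominal strengths with a simple dilution calculation (stock concentration x stock volume / final volume). This sketch assumes "in 250 mls H2O" means a final volume of 250 ml; the helper name is hypothetical.

```python
def diluted_concentration(stock_conc, stock_volume_ml, final_volume_ml):
    """Concentration after diluting a stock to the given final volume."""
    if final_volume_ml <= 0:
        raise ValueError("final volume must be positive")
    return stock_conc * stock_volume_ml / final_volume_ml


# (label, ml of 20X SSC, ml of 10% SDS), each made up to an assumed 250 ml
WASHES = [
    ("3X SSC/0.03% SDS", 37.5, 0.75),
    ("1X SSC", 12.5, 0.0),
    ("0.2X SSC", 2.5, 0.0),
]

for label, ssc_ml, sds_ml in WASHES:
    ssc_x = diluted_concentration(20.0, ssc_ml, 250.0)    # in "X" units
    sds_pct = diluted_concentration(10.0, sds_ml, 250.0)  # in percent
    print(f"{label}: {ssc_x:.2f}X SSC, {sds_pct:.3f}% SDS")
```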


Example 2. A model of angiogenesis is used to determine expression in angiogenesis 

In the model of angiogenesis used to determine expression of
angiogenesis-associated sequences, human umbilical vein endothelial cells (HUVEC) were obtained, e.g.,
as passage 1 (p1) frozen cells from Cascade Biologics (Oregon) and grown in maintenance
medium: Medium 199 (Life Technologies) supplemented with 20% pooled human serum,
100 mg/ml heparin and 75 mg/ml endothelial cell growth supplements (Sigma) and
gentamicin (Life Technologies). An in vitro cell system model was used in which 2x10^5
HUVECs were cultured in 0.5 ml of 3 mg/ml plasminogen-depleted fibrinogen (Calbiochem,
San Diego, CA) that was polymerized by the addition of 1 unit of maintenance medium
supplemented with 100 ng/ml VEGF and HGF and 10 ng/ml TGF-α (R&D Systems,
Minneapolis, MN) added (growth medium). The growth medium was replaced every 2 days.
Samples for RNA were collected, e.g., at 0, 2, 6, 15, 24, 48, and 96 hours of culture. The
fibrin clots were placed in Trizol (Life Technologies) and disrupted using a Tissuemizer.
Thereafter standard procedures were used for extracting the RNA (e.g., Example 1).

Angiogenesis-associated sequences thus identified are shown in Tables 1-8.
As indicated, some of the Accession numbers include expressed sequence tags (ESTs).
Thus, in one embodiment herein, genes within an expression profile, also termed expression
profile genes, include ESTs and are not necessarily full length.

TABLE 1:

Pkey: Unique Eos probeset identifier number
Accession: Accession number used for previous patent filings
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
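
For readers who wish to handle the table programmatically, the five columns defined above can be represented as a simple record. This is a sketch under stated assumptions: the tab-separated layout, the class name, and the parser are hypothetical and do not describe how the table was generated; the example row uses the first entry of Table 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProbesetRecord:
    pkey: int                   # unique Eos probeset identifier
    accession: str              # accession number used in earlier filings
    ex_accn: str                # exemplar (Genbank) accession number
    unigene_id: Optional[str]   # Unigene number; absent for some rows
    unigene_title: str          # Unigene gene title


def parse_row(line: str, sep: str = "\t") -> ProbesetRecord:
    """Parse one row, assuming exactly five separator-delimited fields."""
    fields = [f.strip() for f in line.split(sep)]
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}")
    pkey, accession, ex_accn, unigene_id, title = fields
    return ProbesetRecord(int(pkey), accession, ex_accn, unigene_id or None, title)


row = parse_row("134404\tAB000450\tAB000450\tHs.82771\tvaccinia related kinase 2")
print(row)
```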






Pkey


Accession 


ExAccn 


UnigeneID


134404 


AB000450


AB000450 


Hs.82771 


121443 


AB002380 


AF180681 


Hs.6582 


100082 


AB003103


AA130080 


Hs.4295 


132817 


AB004884 


N27852 


Hs.57553 


130150 


AF000573_rna1


BE094848 


Hs.15113 


100104 


AF008937 


AF008937 


Hs.102178 


130839 


AF009301 


AB011169 


Hs.20141 


427064 


AF009368 


AF029674 


Hs.173422 


100113 


D00591 


NM_001269


Hs.84746 


133980 


D00760 


AA294921 


Hs.250811 


100129 


D11139 


AA469369 


Hs.5831 


100154 


D14657 


H60720 


Hs.81892 


100169 


D14878 


AL037228 


Hs.82043 


101956 


D17716 


NM_002410


Hs.121502 


100190 


D21090 


M91401 


Hs.178658 


134742 


D26135 


NM_001346


Hs.89462 


100211 


D26528 


D26528 


Hs.123058 


100238 


D30742 


L24959 


Hs.348 


130283 


D31762 


NM_012288


Hs.153954 


134237 


D31765 


D31765 


Hs.170114 


100248 


D31888 


NM_015156


Hs.78398 


100256 


D38128 


D25418 


Hs33 


100262 


D38500 


D38500 


Hs^78468 


134329 


D38551 


N92036 


Hs.81848 


100281 


D42087 


AF091035 


Hs.184627 


100294 


D49396 


AA331881 


Hs.75454 


100327 


D55640 


D55640 




100335 


D63391 


AW247529 


Hs.6793 


134495 


D63477 


D63477 


Hs.84087 


100338 


D63483 


D86864 


Hs.57735 


135152 


D64015 


M96954 


Hs.182741 


134269 


D79990 


NM_014737


Hs.80905 


100372 


D79997 


NM_014791


Hs.184339 


134304 


D80010 


BE613486 


Hs.81412 


100394 


D84276 


D84284 


Hs.66052 


100405 


D86425 


AW291587 


Hs.82733 


100418 


D86978 


D86978 


Hs.84790 


133154 


D87012 


D87012 


Hs.194685 


134347 


D87075 


AF164142 


Hs.82042 


128653 


D87432 


D87432 


Hs.10315 


100438 


D87448 


AA013051 


Hs.91417 


134593 


D87845 


NM_000437


Hs^34392 


100481 


HG109WfT1098 X70377 


Hs.121489 


100552 


HG2167-HT2237 AA019521 


Hs301946 


100591 


HG2415+TT2511 NMJXM091 


Hs.231444 


cds 

100652 


HG2825-HT2949 BE613608 


Hs.142653 


100662 


HG2887-HT303U 


AI368680 


100899 


HG4660-HT5073 AL039123 


Hs.103042 


100905 


HG4704^T5146 L12260 


Hs.172816 


100945 


HG8B4-HT884 


AF002225 


HS.1B0686 


100950 


HG919-HT919 


AF128542 


Hs.166846 


100964 


J00212_f 


J00212 




135407 


jcwoa 


J04029 


Hs.99935 


130149 


J04031 


AW067805 


Hs.172665 


131877 


J04088 


J04088 


Hs.156346 


101016 


J04543 


J04543 


Hs.78637 


134786 


L06139 


T2961B 


Hs.89640 


134100 


L07540 


AA460085 


Hs.171075 


134078 


L08895 


L08895 


Hs.78995 


101132 


L11239 


L11239 


HsJ6993 


134849 


L11353 


BE409525 


Hs£02 


106432 


L13773 


AK000310 


Hs.17138 



vaccinia related kinase 2

Rho guanine exchange factor (GEF) 12

proteasome (prosome, macropain) 26S subunit, non-ATPase, 12

tousled-like kinase 2

homogentisate 1,2-dioxygenase (homogentisate oxidase)
syntaxin 16

similar to S. cerevisiae SSM4
KIAA1605 protein
chromosome condensation 1

v-ral simian leukemia viral oncogene homolog B (ras related; GTP binding protein)
tissue inhibitor of metalloproteinase 1 (erythroid potentiating activity, collagenase inhibitor)
KIAA0101 gene product
D123 gene product

mannosyl (alpha-1,6-)-glycoprotein beta-1,6-N-acetylglucosamin
RAD23 (S. cerevisiae) homolog B
diacylglycerol kinase, gamma (90kD)

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 7 (RNA helicase, 52kD)
calcium/calmodulin-dependent protein kinase IV
TRAM-like protein
KIAA0061 protein
KIAA0071 protein

prostaglandin I2 (prostacyclin) receptor (IP)
postmeiotic segregation increased 2-like 4
RAD21 (S. pombe) homolog
KIAA0118 protein
peroxiredoxin 3

gb:Human monocyte PABL (pseudoautosomal boundary-like sequence) mRNA, clone Mo2.
platelet-activating factor acetylhydrolase, isoform Ib, gamma subunit (29kD)
KIAA0143 protein
acetyl LDL receptor; SREC

TIA1 cytotoxic granule-associated RNA-binding protein-like 1
Ras association (RalGDS/AF-6) domain family 2
KIAA0175 gene product
lipin 1

CD38 antigen (p45)
nidogen 2
KIAA0225 protein
topoisomerase (DNA) III beta

solute carrier family 23 (nucleobase transporters), member 1

solute carrier family 7 (cationic amino acid transporter, y+ system), member 6

topoisomerase (DNA) II binding protein

platelet-activating factor acetylhydrolase 2 (40kD)

cystatin D



Homo sapiens, Similar to hypothetical protein PRO1722, clone MGC:15692, mRNA, complete



Hs.816 SRY (sex determining region Y)-box 2

microtubule-associated protein 1B

neuregulin 1

ubiquitin protein ligase E3A (human papilloma virus E6-associated protein, Angelman syndrome)

polymerase (DNA directed), epsilon

Empirically selected from AFFX single probeset

keratin 10 (epidermolytic hyperkeratosis; keratosis palmaris et plantaris)

methylenetetrahydrofolate dehydrogenase (NADP+ dependent), methenyltetrahydrofolate

topoisomerase (DNA) II alpha (170kD)

annexin A7

TEK tyrosine kinase, endothelial (venous malformations, multiple cutaneous and mucosal)
replication factor C (activator 1) 5 (36.5kD)

MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C)
gastrulation brain homeo box 1
neurofibromin 2 (bilateral acoustic neuroma)
hypothetical protein FLJ20303
101152


L13800 


A1984625 


Ks.9884 


135397 


L14922 


L14922 


Hs 166563 

1 1 www 


131687 


L15189 


BE297535 


Hs.3069 


101168 


L15388 


NNL005308 


H0 11 56g 


421155 


L16B95 


H87879 


Hs. 102267 


101226 


L27476 


AF083892 


Hs.75608 


133975 


L27624 


C18356 


Hs.295944 


134739 


L32976 


NMJJ02419 


Hs.89449 


130155 


L33404 


AA101043 


He 151254 

f TO, 1 «J IMH 


440538 


L35263 


W76332 


Hs 79107 

ildtl 9 Ivl 


132813 


L37347 


BE313S25 




101294 


L40371 


AF1R841S 


He 116784 
no. I loi 0*t 


101300 


L40391 




Ue 74137 


101310 


L41607 


L41607 


Hs.934 


130344 


L77566 


AW250122 


Hs 154879 

no. 9 


embryonic lethal 






10 i 381 


M 13928 


AW675039 


Hs.1227 


101668 


Ml 401 6 


AW005903 


Hs.78601 


133780 


M14219 


AA557660 


Hs.76152 


101396 


M15796 


BE267931 


HS.7B996 


101447 


M21305 


M21305 




101458 


M22092 


M22092 




101470 


M22698 


MM 000646 


Hs.1846 


134604 


M22995 


MM 002884 


Hs.865 

na.OW 


101478 


M23379 


mm mTflon 


Hs758 
no./ s/o 


406698 


M24364 


ALhXKJO 


Ha 73031 

no./ «JO\3 1 


133519 


M24400 


AW5fl30B2 


no./ h»ju< 


131185 


M25753 


RF?R0fi74 

OLXQlAJf H 




134116 


M27691 


RA4K94 


He 79194 
no./ 0 


133999 


M28213 


AA535244 


Hs. 76305 


130174 


M29550 


M2&51 


Hs.151531 


129963 


M29971 




Hs.1384 


132983 


M30269 


M30769 

IVWU£09 


Hq 62041 


133900 


M31158 


M31158 


Hs 77439 
no. / 1 twa , 


101543 


M31166 


M31166 




101545 


M31210 




Hq 1*5421 f) 
. no. 1 o*rfc 1 \j 


101620 


M55420 


S55271 


Hs.247930 


134691 


M59979 


A\AftA9Qft7 


H*ftft474 


133595 


M62610 


MrVjaOi/ O 


He 75173 


130425 


M63838 


AA943383 


He 15553D 
no. 1 <j^jou 


101700 


M64710 


rwnw 


He 94791 R 
no^H/ 9 10 


101714 


M68674 


IVIOOO/ *T 


ns^ 1 1 jo/ 


134246 


M74524 


U&0HO9 


Hs.80612 


101760 


M80254 


M60254 


Hs.173125 
no. 1 1 0 


133948 


M81780_cds3 


X59960 


Hs.77813 


101791 


M83822 


M83822 


Hs.62354 


101612 


M86934 


8E439894 


He 78991 


101813 


M87338 




He 139226 
no. i*J3it.u 


133396 


M96326jna1 




Hs.72885 


135152 


M96954 




He 182741 
no. ioz./*t 1 


129026 


M98833 


AL120297 * 

rWm 1 £.V<.0 1 


Hs 108043 


101901 


S65793 




Me VIA 
ns.ouo 


134831 


S72370 


AAA«vld7Q 


Me ROfton 
no.uoOoU 


134039 


S78569 




He 7flfi7? 
no./ 00/ c 


134395 


S79873 


rvvK/OOOo 


He R262 


101975 


S83325 


nrWJluf If 


Lfcj 9ft3fifi4 
no^oooo** 


101977 


S83364 


rti 1 ia 10 


Hs 184062 
no. io*tv9£ 


101978 


S83365 




He 5RQQ 


Interacting factor) 






101998 


U01212 


1101919 

UU lilt 


He 248153 


102003 


U01922 


uu 1 9a 


He 125555 
no. ibMW 


102007 


U02556 


U02556 


Hs.75307 


102009 


U02680 


BF245149 


Hs 82643 


416658 


U03272 


U03272 


Hs.79432 

no. ■ vtut 


132951 


U04209 


AWR21182 

r\V»Ot i lOuC 


Hs 61418 

nO.U 1™ IU 


135389 


U05237 




He Q9872 


102048 


U07225 


U07225 


Hs339 


130145 


U07620 


U34820 


Hs.151051 


303153 


U09759 


U09759 


Hs^46857 


420269 


UQ9820 


U72937 


H&96264 


102095 


U11313 


U11313 


Hs.75760 


102123 


U14518 


NMJJ018Q9 


Hs.1594 


102126 


U14575 


AW950870 


Hs.78961 


102133 


U15173 


AU076845 


Hs.155596 


102139 


U15932 


NM.004419 


HS2128 


102162 


U18291 


AA450274 


Hs.1592 



spindle pole body protein
replication factor C (activator 1) 1 (145kD)
heat shock 70kD protein 9B (mortalin-2)
G protein-coupled receptor kinase 5
lysyl oxidase

tight junction protein 2 (zona occludens 2)
tissue factor pathway inhibitor 2
mitogen-activated protein kinase kinase kinase 11
kallikrein 7 (chymotryptic, stratum corneum)
mitogen-activated protein kinase 14

solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2

thyroid hormone receptor interactor 4

transmembrane trafficking protein

glucosaminyl (N-acetyl) transferase 2, branching enzyme

DiGeorge syndrome critical region gene DGSI; likely ortholog of mouse expressed sequence 2

aminolevulinate, delta-, dehydratase
uroporphyrinogen decarboxylase
decorin

proliferating cell nuclear antigen

gb:Human alpha satellite and satellite 3 junction DNA sequence.

gb:Human neural cell adhesion molecule (N-CAM) gene, exon SEC and partial cds.

tumor protein p53 (Li-Fraumeni syndrome)

RAP1A, member of RAS oncogene family

RAS p21 protein activator (GTPase activating protein) 1

major histocompatibility complex, class II, DQ beta 1

chymotrypsinogen B1

cyclin B1

cAMP responsive element binding protein 1
RAB2, member RAS oncogene family

protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform (calcineurin A beta)
O-6-methylguanine-DNA methyltransferase
nidogen (enactin)

protein kinase, cAMP-dependent, regulatory, type II, beta
pentaxin-related gene, rapidly induced by IL-1 beta
endothelial differentiation, sphingolipid G-protein-coupled receptor, 1
Epsilon, IgE

prostaglandin-endoperoxide synthase 1 (prostaglandin G/H synthase and cyclooxygenase)

transcription factor 6-like 1 (mitochondrial transcription factor 1-like)

interferon, gamma-inducible protein 16

natriuretic peptide precursor C

phospholipase A2, group IVA (cytosolic, calcium-dependent)

ubiquitin-conjugating enzyme E2A (RAD6 homolog)

peptidylprolyl isomerase F (cyclophilin F)

sphingomyelin phosphodiesterase 1, acid lysosomal (acid sphingomyelinase)
cell division cycle 4-like

DNA segment, numerous copies, expressed probes (GS1 gene)

replication factor C (activator 1) 2 (40kD)

azurocidin 1 (cationic antimicrobial protein 37)

TIA1 cytotoxic granule-associated RNA-binding protein-like 1

Friend leukemia virus integration 1

arrestin 3, retinal (X-arrestin)

pyruvate carboxylase

laminin, alpha 4



aspartate beta-hydroxylase
putative Rab5-interacting protein

putative transmembrane protein; homolog of yeast Golgi membrane protein Yif1p (Yip1p-
olfactory marker protein

translocase of inner mitochondrial membrane 8 (yeast) homolog A
t-complex-associated-testis-expressed 1-like
protein tyrosine kinase 9

fibrillin 2 (congenital contractural arachnodactyly)
microfibrillar-associated protein 1



purinergic receptor P2Y, G-protein coupled, 2
mitogen-activated protein kinase 10
mitogen-activated protein kinase 9

alpha thalassemia/mental retardation syndrome X-linked (RAD54 (S. cerevisiae) homolog)
sterol carrier protein 2
centromere protein A (17kD)
protein phosphatase 1,

BCL2/adenovirus E1B 19kD-interacting protein 2
dual specificity phosphatase 5
CDC16 (cell division cycle 16, S. cerevisiae, homolog)
102164 U18300


NMJ)Q0107 


Hs.77602 


427653 U18383 


AA 159001 


Hs. 180059 


131817 U20538 


U 20535 


Hs.3280 


102200 U21551 


AA232362 


Hs 157205 


102210 023028 


BE619413 


Hs.2437 


102214 U23752 


U23752 


Hs.32964 


132811 U25435 


U25435 


Hs.57419 


131319 U25997 


NM.003155 


Hs.25590 


102256 028251_cds2 


U28251 


Hs.53237 


132316 1128831^ 


U28831 


Hs.44566 


102269 U30245 


U30245 




Gxon 1. 






134365 U32315 


AA568906 


Hs.82240 


102293 032439 


AF090116 


Hs.79348 


102298 U32849 


AA382169 


Hs.54483 


102325 035139 


AI815867 




302344 U36764 


BE303044 


Hs.192023 


102361 1139400 


AA223616 


Hs.75859 


102367 U39657 


U39656 


Hs 118825 


102388 U41344 


AA362907 


Hs.76494 


102394 U41766 


NM 003816 


Hs.2442 


129829 U41813 


AF010258 

I** V IV4VV 


Hs.127428 


102251 U41815 


NNL0G4398 


Hs.41706 


102409 U432B6 


BE300330 


Hs 118725 


133746 U44378 


AW410035 


Hs.75862 


102423 U44754 


Z47542 


Hs.179312 


132828 U47011 cdsl 


AB014615 


Hs.57710 


130441 U47077 


U63630 


Hs.1 55637 


102450 U48251 


U48251 


Hs.75871 


129350 U50535 


U50535 


Hs.110630 


102534 U56833 


U96759 


Hs.198307 


130457 U58091 


AB014595 


Hs.155976 


135065 U58837 


AA019401 


Hs.93909 


102560 U59289 


R97457 


Hs.63984 


102567 U59863 


U63830 


Hs.146847 


134305 U67122 


U61397 


Hs.81424 


102638 U67319 


U67319 


Hs.9216 


132736 U68019 


AW081883 


Hs.288261 


sapiens mad protein homolog (hMAD-3) mRNA 


133070 U69611 


U92649 


Hs.64311 


102663 U70322 


NWL002270 


Hs.168075 


134660 073524 


073524 


Hs.87465 


102735 U79267 


AF111106 


Hs.3382 


102741 079291 


AW959829 


Hs.83572 


101175 U82671jxfe2 


U82671 


Hs.36980 


132164 U84573 


AI752235 


Hs.41270 


102823 U90914 


D85390 


Hs.5057 


102826 U91316 


NM_007274 


Hs.8679 


102831 U91932 


AA262170 


Hs.80917 


102846 U96131 


BE264974 


Hs.6566 


129777 U97018 


U97018 


Hs.12451 


134161 U97188 


AA534543 


Hs.79440 


134854 V00503 


J03464 


Hs.179573 


302363 X04327 


AW163799 


Hs.198365 


133708 X06389 


AI018666 


Hs.75667 


125701 X07496 


T72104 


Hs.93194 


102915 X07820 


X07820 


HS2258 


134656 X14787 


AI750878 


Hs.87409 


413858 X15525_ma1 


NM.001610 


Hs.75589 


102968 X16396 


AU076611 


Hs.154672 


cydohydrofase 
102971 X16609 


X16609 


Hs.183805 


134037 X535B6_ma1 


A1808780 


Hs.227730 


103023 X53793 


AW500470 


Hs.117950 


103037 X54936 


BE018302 


Hs.2894 


130282 X55740 


BE245380 


Hs.153952 


134542 X57025 


M14156 


Hs.85112 


128568 X60673_ma1 


H12912 


Hs374691 


103093 X60708 


S79876 


Hs.44926 


133606 XS2043 


U10564 


Hs.75188 


129063 X63097 


X63094 


Hs£83822 


424460 X63563 


BE275979 


Hs.296014 


133227 X64037 


AW977263 


Hs.68257 


103181 X69636 


X69636 


HS334731 


103184 X69878 


043143 


Hs.74049 


103194 X70649 


NWL004939 


Hs.78580 



damage-specific DNA binding protein 2 (48kD)

nuclear respiratory factor 1

caspase 6, apoptosis-related cysteine protease

branched chain aminotransferase 1, cytosolic

eukaryotic translation initiation factor 2B, subunit 5 (epsilon, 82kD)

SRY (sex determining region Y)-box 11

CCCTC-binding factor (zinc finger protein)

stanniocalcin 1

ESTs, Highly similar to Z169_HUMAN ZINC FINGER PROTEIN 169 [H.sapiens]
KIAA1641 protein

gb:Human myelomonocytic specific protein (MNDA) gene, 5' flanking sequence and complete
syntaxin 3A

regulator of G-protein signalling 7
N-myc (and STAT) interactor
necdin (mouse) homolog

eukaryotic translation initiation factor 3, subunit 2 (beta, 36kD)

chromosome 11 open reading frame 4

mitogen-activated protein kinase kinase 6

proline arginine-rich end leucine-rich repeat protein

a disintegrin and metalloproteinase domain 9 (meltrin gamma)

homeobox A9

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 10 (RNA helicase)
selenophosphate synthetase 2

MAD (mothers against decapentaplegic, Drosophila) homolog 4

small nuclear RNA activating complex, polypeptide 1, 43kD

fibroblast growth factor 8 (androgen-induced)

protein kinase, DNA-activated, catalytic polypeptide

protein kinase C binding protein 1

Human BRCA2 region, mRNA sequence CG00S

von Hippel-Lindau binding protein 1

cullin 4B

cyclic nucleotide gated channel beta 1
cadherin 13, H-cadherin (heart)
TRAF family member-associated NFKB activator
ubiquitin-like 1 (sentrin)

caspase 7, apoptosis-related cysteine protease

Homo sapiens cDNA: FLJ23037 fis, clone LNG02036, highly similar to HSU68019 Homo

a disintegrin and metalloproteinase domain 17 (tumor necrosis factor, alpha, converting enzyme)

karyopherin (importin) beta 2

ATP/GTP-binding protein

protein phosphatase 4, regulatory subunit 1

hypothetical protein MGC14433

melanoma antigen, family A, 2

procollagen-lysine, 2-oxoglutarate 5-dioxygenase (lysine hydroxylase) 2
carboxypeptidase D

cytosolic acyl coenzyme A thioester hydrolase
adaptor-related protein complex 3, sigma 1 subunit
thyroid hormone receptor interactor 13
echinoderm microtubule-associated protein-like
IGF-II mRNA-binding protein 3
collagen, type I, alpha 2
2,3-bisphosphoglycerate mutase
synaptophysin
apoprotein A-I

matrix metalloproteinase 10 (stromelysin 2)

thrombospondin 1

acid phosphatase 2, lysosomal

methylene tetrahydrofolate dehydrogenase (NAD+ dependent), methenyltetrahydrofolate

ankyrin 1, erythrocytic
integrin, alpha 6

multifunctional polypeptide similar to SAICAR synthetase and AIR carboxylase

placental growth factor, vascular endothelial growth factor-related protein

5' nucleotidase (CD73)

insulin-like growth factor 1 (somatomedin C)

adenylate kinase 3

dipeptidylpeptidase IV (CD26, adenosine deaminase complexing protein 2)

wee1+ (S. pombe) homolog

Rhesus blood group, D antigen

polymerase (RNA) II (DNA directed) polypeptide B (140kD)

general transcription factor IIF, polypeptide 1 (74kD subunit)

Homo sapiens, clone IMAGE:3448306, mRNA, partial cds

fms-related tyrosine kinase 4

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 1
1032)8


X72841 


AW411340 


Hs.31314 


retjnobi3Stcni3"bindiriQ protein 7 


129698 


X74987 


8E242144 


Hi 12013 


ATP-hJnt&io cassette. sub-famDv E fQABP). member 1 


131485 


X83107 


F06972 


Hs.27372 


BMX non-receptor tyrosine kinase 


130729 


X84194 


AI963747 


Hs.18573 


acytphosphatase 1 , erythrocyte (common) type 


103334 


X85753 


NM 001260 


Hs 25283 


evcfrn-decendent kinase 8 

Vj will rvtpvi ww i\ iiuKMw w 


132645 


X87870 

AO* Of V 


AI654712 


Hs.54424 


hpnatocvte nucfsar factor 4 aloha 


135094 


X89066 


NM 003304 


Hs 250687 


transient receptor potential channel 1 


103352 


AOJJ030_.UJ«. 


H09366 


na«f oojj 


it rant-DMA a/voosvtass 


103353 




YftQ3QQ 


Hs 119274 


RAS o21 nrotebi activator fGTPase adh/aftna oroteinl 3 f Ins/1 3 4.5\P4-bindirra Drofetn) 


132173 




AOjntW 


Hs.41716 


pndnftififo! ceO-soselffc mnlpajfa 1 


103371 


X91247 


X91247 


Hs.13046 


fhWoHnYin tafacfasfl 1 

UIMOUVaIII ImMHWBwI 1 




A3 ioto 


/Vw900W9 




roirmp-nrh plement hJnffno nmtein A 


103375 




AL036166 


Hs.323378 


mated vestde msmbranfi orntein 


103378 


A3c 1 IV 


AL1 19690 


Hs.153618 


HCGVI1M orotein 


128510 


X94703 


X94703 


Hs.296371 


RAB28, member RAS oncogene family 


103410 


X96506 


AA158294 


Hs.334879 


DR1 -associated orotein 1 /nenative cof actor 2 aloha} 


133490 


X97230_f 


AF022044 


. Hs.274601 


killer cell fmmunra lobutirv-dke receotor three domains lona cvtoDlasmic taQ. 1 

Mild t^7ii mill m* w/fc^iiH rmw i vw^/wi | uuwwuviiuii«j iuin^ vjriwpjuwiiuv uui| • 


103438 


X98263 


AW175781 


Hs.152720 


M-phase phosphoprotBJn 6 


103440 


X98296 


X98296 


Hs.77578 


t rhimjiRn soerific cmtease 9 X chromosome fDrosoohfla fat facets related) 


103452 


X99584 


NM.006936 


Hs.85119 


SMT3 fsuDoressor of mif two 3 veasfl homoloa 1 




TUU£0*r 


W25797.comp Hs.177486 


arrnHntrf hpta /A4\ nrpnircnr nmtpfri fnmtpji^fi npyfnJI Alyhptmer diseased 


1351ft* 


YflTCfig 
I Uf JOO 


AW404908 


Hs.96038 


Ric m rnon nhfk? \-TJVp nynrpjKed In manv ficciipQ 


11R57* 


III/ 133 


Y07759 


Hs.170157 






YA7A97 


NM.007046 


Hs.284283 




1 JZ.UOJ 


T Uf 00/ 


BE386490 


Hs.279663 


ruut 


IIWDUU 




AW408009 


Hs.22580 




134389 


Y09858 


Y09858 


Hs.82577 


spindHn-Cke 


132084 


Y12394 


NK.002267 


Hs.3886 


karyopherin alpha 3 ftmportin alpha 4) 


103540 


211559 


NM.002197 


Hs.154721 


aconhase 1, soluble 


133152 


Z11695 


Z11695 


HU24473 


mitog ervactivated protein kinase 1 


103548 


Z15005 


215005 


Hs.75573 


centromere protein E (312kD) 


103612 


Z46261 


BE336654 


Hs.70937 


H3 histone femOy, member A 


129092 


AA011243_s 


D56365 


Hs.63525 


pc4y(rC)-bindbg protein 2 


103692 


AA018418 


AW137912 


HS227583 


Homo sapiens chromosome X map Xp1 1 .23 L-type calcium channel alpha-1 subunit 


(CACNA1F) gene, complete cds; HSP27 


pseudogene, complete sequence; and JM1 protein, JM2 protein, and Hb2£ genes, complete cos 


103695 


AA018758 


AW207152 


Hs. 186600 


ESTs 



129796 AA018804 
132258 AA031993 
132683 AA044217 
131887 AA046548 
member 1 

103723 AA057447^ 
453368 AA058376 
133260 AA083572 

103765 AA085696 

103766 AA088744 

103767 AA089688 
132K1 AA091284 
103773 AA092700 
[Celegans] 
135289 AA092968 
132729 AA094800 
103794 AA100219 
131471 AA1 14885 
134319 AA129547 
103807 AA133016 
1191© AA149507 
129863 AA151005 
103850 AA187101 
103855 AA195179„s 
322026 AA203138 
135300 AA203645 
103861 AA206236 
130634 AA227621 
[Cetegans] 
447735 AA248283 
103909 AA249611 
131236 AA282640 
134060 AA287199 
129013 AA313990 
129435 AA314256 
103988 AA314389 
104000 AA324364 
425234 AA32921U 
128629 AA399187 
133281 AA421079 



BE218319 
AA306325 
BE264633 
W17064 

BE274312 

W20296 

AA403045 

AA085696 

AI920783 

BE244667 

AA393968 

AI219323 

AW372569 

AW970843 

AF244135 

AA164842 

BE304999 

AW958264 

AF142419 

BE379765 

AA187101 

WQ2363 

AW024973 

AA142922 

AA20S235 

AJ769067 

AA775268 

AA249611 

AF043117 

D42039 

AA371156 

AF151852 

AA314389 

AI146527 

Af 155568 

ALQ96748 

AKD01601 



Hs^807 
Hs.4311 
Hs.143638 
Hs^32848 

Hs^l4783 

Hs^88178 

Hs.6906 

HS.1696D0 

Hs.191435 

Hs^96155 

Hs.180145 

Hs.101077 

Hs.9788 

Hs.55682 

Hs30670 

Hs.192619 

Hs.75653 

Hs.103832 

Hs.15020 

Hs.1»872 

Hs213194 

Hs^02267 

Hs.283675 

Hs.278626 

Hs.4944 

Hs.127824 

Hs.6127 

Hs.47438 

Hs.24594 

Hs.78871 

rte.107942 

Hs.111449 

Hs.42500 

Hs*0475 

HS.155489 

Hs.102708 

Hs.69594 



GTPase Rab14

SUMO-1 activating enzyme subunit 2
WD repeat domain 4

SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e,



Homo sapiens cDNA FLJ14041 fis, clone HEMBA1005780
Homo sapiens cDNA FLJ11968 fis, clone HEMBB1001133
Homo sapiens cDNA: FLJ23197 fis, clone REC00917
KIAA0826 protein
ESTs

CGI-100 protein
HSPC030 protein

ESTs, Weakly similar to T22353 hypothetical protein F47G9.4 - Caenorhabditis

hypothetical protein MGC10924 similar to Nedd4 WW-binding protein 5
eukaryotic translation initiation factor 3, subunit 7 (zeta, 66/67kD)



KIAA1600 protein

fumarate hydratase

similar to yeast Upf3, variant B

homolog of mouse quaking QKI (KH domain RNA binding protein)
sperm associated antigen 9
hypothetical protein MGC10895
hypothetical protein FLJ10330
NPD009 protein

Arg/Abl-interacting protein ArgBP2
hypothetical protein FLJ12783

ESTs, Weakly similar to T28770 hypothetical protein W03D2.1 - Caenorhabditis

Homo sapiens cDNA: FLJ23020 fis, clone LNG00943
SH3 domain binding glutamic acid-rich protein
ubiquitination factor E4B (homologous to yeast UFD2)
mesoderm development candidate 2
DKFZP564M112 protein
CGI-94 protein
ADP-ribosylation factor-like 5

polymerase (RNA) II (DNA directed) polypeptide J (13.3kD)
NS1-associated protein 1
DKFZP434A043 protein
high-mobility group 20A
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10 



104104 AA422029 
[H .sapiens] 
108154 AA425230 
132091 AA447052 
135073 AA452000 
131367 AA456687 
129593 AA487015_s 
135266 AB002326 
133505 C01527 
132064 C01714 
134393 C01811J 
131427 C02352_S 
133435 C02375 
104282 C14448 
134827 D16611J5 
130443 D25216 
131742 D31352 
132837 D58024_s 
130377 D80897 
104334 D82514 
134593 D87845 
134731 D89377J 
129913 H06583 
131670 H40732 



AA422029 Hs.143640 ESTs, Weakly similar to hyperpolarization-activated cyclic nucleotide-gated channel hHCN2



NMJJ05754 
AW954243 
W55956 
AI750575 
AI338247 
R41179 
AI630124 
AA121098 
W52642 
AF151879 
A1929357 
C14448 
BE314037 
D25216 
AA961420 
AA370362 
NNL014909 
D82614 
NM_000437 
D89377 
NM.001310 
H03514 
AA129551 
H56731 
AA306090 
N74724 
A1819448 
L36531 
M63154 
AW246273 
AW955705 
N56191 
AI038989 
AL044335 
AK000096 
AKD01676 
R22303 



AI091173 

AW452738 

AA040620 

AA923382 

F08282 

AF167706 

AW815036 

BE298665 



104394 H46617 
104402 H56731 
129781 H75570 
129077 H78886 
104417 H81241 
134927 L36531 
129280 M63154 
134498 M63180 
104460 M91504 
104488 N56191 
131248 N78483 

129214 N79268 
130017 R14652 
104530 R20459 
104534 R22303 
sequence. 
104544 R33779 
133328 R36553 
104567 R64534 
128562 R66475 
129575 R70621 
130776 R79356 
104599 R84933 
104660 RC.AA007160 
104667 RC_.AA007234._s AI239923 
104718 RCLAA01B409 AJ 143020 
104764 RC__AA025351 AI039243 

104786 RC_AA027168 AA027167 

104787 RC.AA027317 AA027317 
simitar to contains Atu repetitive element, 
134079 RC.AA029423 AK001751 
104804 RC_AAQ31357 
104865 RCLAAQ45136 
130828 RCLAA053400 
104907 RC_M055829 
WARNING ENTRY [H.sapiens] 
104943 RC..AA065217 AF072873 
105013 RCLAA116Q54 
105024 RC.AA126311 
132592 RC.AA129390 
105038 RCj*A130273 
105077 RC__AA142919 
105096 RC_AA150205 

129215 RC_AA176867 
105169 RCJVA180321 
132798 RC_AA1804B7 
130401 RCJ\A187634 
105200 RCJW95399 
130114 RC_AA234717 
105330 RC.AA234743 
105337 RCJW234957 
129385 RC.AA235604 



A1858702 
T79340 
AW631469 
AA055829 



HS3789 

AA126311 

AW803564 

AW5C3733 

W55946 

AL042506 

AB040930 

BE245294 

NWL006283 

BE396283 

AA328102 

AA233393 

AW338625 

AI468789 

AA172106 



Hs%22089 Ras-GTPase-activating protein SH3-domain-binding protein
Hs.170218 KIAA0251 protein
Hs£4030 Homo sapiens mRNA; cDNA DKFZp586E1624 (from clone DKFZp586E1624)
Hs.173933 nuclear factor I/A
Hs.98314 Homo sapiens mRNA; cDNA DKFZp586L0120 (from clone DKFZp586L0120)
Hs.97393 KIAA0328 protein
Hs324504 Homo sapiens mRNA; cDNA DKFZp586J0720 (from clone DKFZp586J0720)
Hs.3838 serum-inducible kinase
Hs.8261 hypothetical protein FLJ22393
Hs.26708 CGI-121 protein
Hs^23966 Homo sapiens clone H63 unknown mRNA
Hs^32333 EST
Hs.89866 coproporphyrinogen oxidase (coproporphyria, harderoporphyria)
Hs.155650 KIAA0014 gene product
Hs31433 ESTs
Hs^7958 EGF-TM7-latrophilin-related protein
Hs.155182 KIAA1036 protein
Hs.78771 phosphoglycerate kinase 1
Hs.234392 platelet-activating factor acetylhydrolase 2 (40kD)
Hs.89404 msh (Drosophila) homeo box homolog 2
Hs.13313 cAMP responsive element binding protein-like 2
Hs.10130 ESTs
Hs.172129 Homo sapiens cDNA: FLJ21409 fis, clone COL03924
Hs.132956 ESTs
Hs.124707 ESTs
Hs.108479 ESTs
H*320861 Kruppel-like factor 8
Hs.91296 integrin, alpha 8
Hs.110014 gastric intrinsic factor (vitamin B synthesis)
Hs.84131 threonyl-tRNA synthetase
Hs.62604 Homo sapiens, clone IMAGE:4299322, mRNA, partial cds
Hs.105511 protocadherin 17
Hs.332633 Bardet-Biedl syndrome 2
Hs.109526 zinc finger protein 198
Hs.143198 inhibitor of growth family, member 3
Hs.12457 hypothetical protein FLJ10814
gb:yh26b09.r1 Soares placenta Nb2HP Homo sapiens cDNA clone IMAGE:130841 5', mRNA
H&222362 ESTs, Weakly similar to p40 [H.sapiens]
Hs.265327 hypothetical protein DKFZp761 1141
Hs.5672 hypothetical protein AF140225
Hs.101490 ESTs
Hs.278428 progestin induced protein
Hs.19280 cysteine-rich motor neuron 1
Hs.151251 ESTs
Hs.14846 Homo sapiens mRNA; cDNA DKFZp564D016 (from clone DKFZp564D016)
Hs.30098 ESTs
Hs.36250 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]
Hs.278585 ESTs
Hs.10031 KIAA0955 protein
gb:ze97cr11.s1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:366933 3', mRNA sequence.
Hs.171835 hypothetical protein FLJ10889
Hs.31803 ESTs, Weakly similar to N-WASP [H.sapiens]
Hs.22575 B-cell CLL/lymphoma 6, member B (zinc finger protein)
Hs.203213 ESTs
Hs.196701 ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION
Hs.114218 frizzled (Drosophila) homolog 6
Hs.296288 ESTs, Weakly similar to KIAA0638 protein [H.sapiens]
Hs.9879 ESTs
Hs.288850 Homo sapiens cDNA: FLJ22528 fis, clone HRC12825
Hs.9414 KIAA1488 protein
Hs.234863 Homo sapiens cDNA FLJ12082 fis, clone HEMBB1002492
Hs.21599 Kruppel-like factor 7 (ubiquitous)
Hs.126085 KIAA1497 protein
Hs.180789 S164 protein
Hs.173159 transforming, acidic coiled-coil containing protein 1
Hs.173987 eukaryotic translation initiation factor 3, subunit 1 (alpha, 35kD)
Hs.24641 cytoskeleton associated protein 2
Hs.14992 hypothetical protein FLJ11151
Hs22120 ESTs
Hs23200 myotubularin related protein 1
Hs.110950 Rag C protein
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105376 RCJW236559 
105397 RCLAA2428S8 
131962 RC^AA251776 
131991 RGjAA2519Q9 
5 128658 RC_M252672js 
105489 RCLAA256157 
105508 RCJK256680 
105539 RCJW258873 
135172 RCLAA262727 

10 ■ 131569 RC.M281451 
132542 RCJHA281545 
105643 RC_AA2320S9 
105559 RCLAA283044 
105666 RC.AA283930 

15 105674 RCJW284755 
105709 RCJW291268 
105722 RCJVA291927 
105765 RCJW343514 
115951 RCJW398109 

20 105962 RCJW05737 
105985 RCJU406610 
gbXQ2067 

106008 RCJW411465 
131216 RC AA416886 

25 134222 RCJW424013 
113689 RCJW424148 
106141 RCJ^A424558 
130839 RCJ^A424961_s 
108157 RCJW425367 

30 130777 RCJVA425921 
receptor 

130561 RCJM26220 
106196 RC_AA427735 
WARNING 

35 131878 RC.AA430673 
133200 RC.AA432248 
106302 RCJW435896 
108328 RCJW436705 
450534 RCJWM6561 

40 105423 RC.AA44823B 
133442 RC.AA448688 
439608 RC.AA449756 
106477 RC_AA450303 
106503 RC.AA452411 

45 446999 RCJW454566 
106543 RCJW454667 
130010 RCJ\A456437 
106589 RCJW56646 
105593 RC_AA456826 

50 106596 RCJM56981 
CONTAMINATION 
134655 RC^AA458^9 
member 1 

106636 RC.AA459950 

55 106654 RC^AA460449 
131353 RCJ\A463910 
106707 RC J\A464603 
131710 RCJ^A464606 
106717 RCJ\A465093 

60 131775 RCJW65692 
106747 RCJVA476473 
106773 RC_jAA478109 
105781 RC^AA478474 
106817 RC_AA480889 

65 106846 RCJ\A485223 
106648 RCJW485254 
106856 RCJM86183 
418699 RC_AA496936 
WARNING 

70 107001 RC.AA598589 
130638 RC_AA598831_f 
107054 RCJW600150 
107059 RC. J AA608545 
107080 RC^AA609210 

75 107115 RC.M610108 
107130 RC.AA620582 



AW994032 Hs.8768 hypothetical protein FLJ10849
AA814807 Hs.7395 hypothetical protein FLJ23182
AK000046 Ks^67448 hypothetical protein FLJ20039
AF053306 Hs.36703 budding uninhibited by benzimidazoles 1 (yeast homolog), beta
BE397354 Hs.324830 diptheria toxin resistance protein required for diphthamide biosynthesis (Saccharomyces)-like 2
AA256157 Hs24115 Homo sapiens cDNA FLJ14178 fis, clone NT2RP2003339
AA173942 Hs326416 Homo sapiens mRNA; cDNA DKFZp564H1916 (from clone DKFZp564H1916)
AB040884 Hs.109634 KIAA1451 protein
AB028956 Hs.12144 KIAA1033 protein
AL389951 H&271623 nucleoporin 50kD
AL137751 HsJ263671 Homo sapiens mRNA; cDNA DKFZp434J0812 (from clone DKFZp434l0812); partial cds
BE621719 Hs.173802 KIAA0603 gene product
AA283044 Hs^5625 hypothetical protein FLJ11323
AA426234 Hs.34306 ESTs, Weakly similar to T17210 hypothetical protein DKFZp434N041.1 [H.sapiens]
AI609530 Hs.279789 histone deacetylase 3
AI928962 Hs.26761 DKFZP586L0724 protein
AI922821 Hs.32433 ESTs
AA299688 H&24183 ESTs
BE546245 Hs.301048 sec13-like protein
AW880358 Hs.339808 hypothetical protein FLJ10120
AA406610 gb:w15b10.s1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:753691 3' similar to
AB033888 Hs.8619 SRY (sex determining region Y)-box 18
AI815486 Hs.243901 Homo sapiens cDNA FLJ20738 fis, clone HEP08257
AW855861 Hs.8025 Homo sapiens clone 23767 and 23782 mRNA sequences
AB037850 Hs.16621 DKFZP434I116 protein
AF031463 Hs.9302 phosducin-like
AB011169 Hs.20141 similar to S. cerevisiae SSM4
W37943 Hs.34892 KIAA1323 protein
AW135049 Hs.265418 Homo sapiens cDNA FLJ10643 fis, clone NT2RP2005753, highly similar to Homo sapiens 1-1
AB011095 Hs.16032 KIAA0523 protein
AA525993 H&173699 ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION
AA083764 Hs.6101 hypothetical protein MGC3178
AB037715 Hs.183639 hypothetical protein FLJ10210
AA398859 Hs.18397 hypothetical protein FLJ23221
AL079559 Hs.28020 KIAA0766 gene product
AI570189 Hs.25132 KIAA0470 gene product
AB020722 Hs.16714 Rho guanine exchange factor (GEF) 15
AL137663 Hs.7378 Homo sapiens mRNA; cDNA DKFZp434G227 (from clone DKFZp434G227)
AW864696 Hs.301732 hypothetical protein MGC5306
R23324 Hs.41693 DnaJ (Hsp40) homolog, subfamily B, member 4
AB033042 Hs.29679 cofactor required for Sp1 transcriptional activation, subunit 3 (130kD)
AA151520 Hs.334822 hypothetical protein MGC4485
AA676939 Hs.69285 neuropilin 1
AA301116 Hs.142838 nucleolar phosphoprotein Nopp34
AK000933 Hs.28661 Homo sapiens cDNA FLJ10071 fis, clone HEMBA1001702
AW296451 Hs.24605 ESTs
AA452379 Hs.293552 ESTs, Moderately similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE
AF265208 Hs.123090 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily f,
AW958037 Hs.286 ribosomal protein L4
AW075485 Hs.286049 phosphoserine aminotransferase
AW754182 gb:RC2-CT0321-131199-011-c01 CT0321 Homo sapiens cDNA, mRNA sequence
AK000566 Hs.98135 hypothetical protein FLJ20559
NM_015368 Hs^0985 pannexin 1
AA600357 Hs.239489 TIA1 cytotoxic granule-associated RNA-binding protein
AB014548 Hs.31921 KIAA0648 protein
NM_007118 Hs.171957 triple functional domain (PTPRF interacting)
AA478109 Hs.188833 ESTs
AA330310 Hs^4181 ESTs
D61216 Hs.18672 ESTs
AB037744 Hs.34892 KIAA1323 protein
AA449014 Hs.121025 chromosome 11 open reading frame 5
W58353 Hs285123 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 2005779
BE539639 Hs.173030 ESTs, Weakly similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE CONTAMINATION
AI925520 Hs31016 putative DNA binding protein
AW021276 Hs.17121 ESTs
AI076459 Hs.15978 KIAA1272 protein
BE614410 Hs23044 RAD51 (S. cerevisiae) homolog (E coli RecA homolog)
AL122043 Hs.19221 hypothetical protein DKFZp566G1424
BE379623 Hs.27693 peptidylprolyl isomerase (cyclophilin)-like 1
AB033106 Hs.12913 KIAA1280 protein
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107156 RCJW621239 

107174 RC_AA621714 

130621 RCLAA621718 

107190 RCJM9S73 

5 132626 RC_D25755_s 

107217 RC D51095 

131610 RC_D6027ZJ 

* 129604 T08879 

10735 T34527 
10 (GalNAc-TI) 

107299 T40327_s 

107315 T62771_s 

107316 T63174.S ' 
107328 T83444 

15 107334 T93641 

134715 U48263 

128636 U49065 

129938 1/79300 

107375 U88573 

20 130074 U93867 

107387 W01094 

132036 W01568 

107426 W26853 

113857 W27179 

25 135388 W27965 

130419 W36280.S 

107469 W47063 

132616 W79060 

107506 W88550 

30 132358 X60486 

107522 X78931.S 

125827 Z14077.S 

107582 R<LAA002147 

107609 RC.M004711 

35 107661 RC.AA010383 

107714 RCJWJ15761 

107775 RC_AA018772 

107832 RC_AA021473_r 
sequoice. 

40 107859 RCJ\A024835 

124337 RCJWJ25858 

107914 RC.M027229 
[C.elegans] 

107935 RCJW029428 

45 116262 RCJVA035143 

131451 RC.AA035237 

108007 RC_AAC39347 

108029 RCJW040740 

108040 RC_AA041551 
50 member 1 

108084 RC.AA045513 

108088 RCLAA045745 

108168 RCJ\A055348 

130719 RC_AAG56582_s 

55 108189 RC_AA05S697 

108190 RC.AA056746 

1082)3 RCJUV057678 

108216 RCJU058681 

108217 RCJW058686 
60 108245 RC.M062840 

108277 RCJUV064859 
mRNA 

108280 RCJWJ65059 

108309 RC.AAQ69923 

65 133739 RC_AA070799_s 

108340 RC_AA070815 

108403 RC.AA075374 
3", mRNA sequence. 

108427 RC.AA076382 

70 3*. mRNA sequence. 

108435 RC_M078787 

108439 RC_AA078986 
3*, mRNA sequence. 

108465 RCJW079393 

75 108469 R<LAA079487 



AA137043 Hs.9653 programmed cell death 6-interacting protein
BE122762 HsJ25338 ESTs
AW513087 Hs.16803 LUC7 (S. cerevisiae)-like
AA836401 H<L5103 ESTs
AW504732 Hs^1275 hypothetical protein FLJ11011
AL060235 Hs.35861 DKFZP586E1621 protein
AA357879 H&29423 scavenger receptor with C-type lectin
AF088886 Hs.11590 cathepsin F
AA186629 Hs.80120 UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 1
BE277457 Hs.30661 hypothetical protein MGC4606
AA316241 Hs.90691 nucleophosmin/nucleoplasmin 3
T63174 Hs.193700 Homo sapiens mRNA; cDNA DKFZp586I0324 (from clone DKFZp586I0324)
AW959891 Hs.76591 KIAA0387 protein
T93597 Hs.187429 ESTs
U48263 Hs.89040 prepronociceptin
U49065 Hs.102865 interleukin 1 receptor-like 2
AW003568 Hs.135587 Human clone 23629 mRNA sequence
BE011845 Hs.251064 high-mobility group (nonhistone chromosomal) protein 14
AL038596 Hs250745 polymerase (RNA) III (DNA directed) (62kD)
D86983 Hs.118893 Melanoma associated gene
AL157433 Hs.37706 hypothetical protein DKFZp434E2220
W26853 Hs.291003 hypothetical protein MGC4707
AW243158 Hs.5297 DKFZP564A2416 protein
W27965 Hs.99865 epimorphin
AF037448 Hs.155469 NS1-associated protein 1
W47063 Hs.94668 ESTs
BE262677 Hs.283558 hypothetical protein PRO1855
AB028981 Hs.8021 KIAA1058 protein
NM_003542 Hs.46423 H4 histone family, member G
X78931 Hs.99971 zinc finger protein 272
NM_003403 Hs.97496 YY1 transcription factor
AA002147 Hs.59952 EST
R75654 Hs.164797 hypothetical protein FLJ13693
AA010383 Hs.60389 ESTs
AA015761 Hs.60642 ESTs
AW008846 Hs.60857 ESTs
AA021473 gb:ze66c11.s1 Soares retina N2b4HR Homo sapiens cDNA clone IMAGE:363956 3', mRNA
AW732573 Hs.47584 potassium voltage-gated channel, delayed-rectifier, subfamily S, member 3
N23541 Hs.281561 Homo sapiens cDNA: FLJ23582 fis, clone LNG13759
AA027229 Hs.61329 ESTs, Weakly similar to T16370 hypothetical protein F45E12.5 - Caenorhabditis elegans
AA029428 Hs.61555 ESTs
AI936442 Hs.59838 hypothetical protein FLJ10808
AA992841 Hs.27263 KIAA1458 protein
AA039347 Hs.61916 EST
AA040740 Hs.62007 ESTs
AL121031 Hs.159971 SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily b,
AA058944 Hs.116602 Homo sapiens, clone IMAGE:4154008, mRNA, partial cds
AA045745 Hs.62886 ESTs
AI453137 Hs.63176 ESTs
AA679262 Hs.14235 hypothetical protein FLJ20008; KIAA1839 protein
AW376061 Hs.63335 ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]
AA056746 Hs.63338 EST
AW847814 Hs.289005 Homo sapiens cDNA: FLJ21532 fis, clone COL06049
AA524743 Hs.44883 ESTs
AA058686 Hs.62588 ESTs
BE410285 Hs.89545 proteasome (prosome, macropain) subunit, beta type, 4
AA064859 gb:zm50f03.s1 Stratagene fibroblast (937212) Homo sapiens cDNA clone IMAGE:529085 3'
AA065069 gb:zm12e11.s1 Stratagene pancreas (937208) Homo sapiens cDNA clone 3', mRNA sequence
AA069818 gb:zm67e03.s1 Stratagene neuroepithelium (937231) Homo sapiens cDNA clone 5' similar to
BE536554 Hs.278270 unactive progesterone receptor, 23 kD
AA069820 Hs.180909 peroxiredoxin 1
AA075374 gb:zm87a01.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone IMAGE:544872
AA076382 gb:zm91g08.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone IMAGE:545342
T82427 Hs.194101 Homo sapiens cDNA: FLJ20869 fis, clone ADKA02377
AA078986 gb:zm92h01.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone IMAGE:545425
AA079393 Hs.3462 cytochrome c oxidase subunit VIIc
AA079487 gb:zm97f08.s1 Stratagene colon HT29 (937221) Homo sapiens cDNA clone 3', mRNA






WO 02/079492 



PCT/US02/04915 



108500 RC_AA083207 AA083207
108501 RC_AA083256 AA083256
gb:M333Q8
108533 RC_AA084415 AA084415
mRNA
108562 RC_AA085274 AA100798
gb^C15341
108589 RC_AA088678 AI732404
130890 RC_AA100925 AI907537
134585 RCJW01255 D14041
130385 RC_AA126474 AW0S7800
108749 RC_AA127017 AA127017
108807 RCJW29968 AIS52236
108803 RC_AA130240 AA045088
108833 RC_AA131866 AF188527
107290 RC_AA132039 W27740
108846 RC_AA132983 AL117452
108857 RC_AA133250 AK001468
131474 RC_AA133583_s L46353
108894 RC_AA135941 AK001431
108941 RC_AA148650 AA148650
IMAGE:567202 3'
108968 RC_AA151110 AI304870
108995 RC_AA155754 AW995610
109001 RC_AA156125 AI056548
131183 RC_AA156289 AI611807
109019 RC_AA156997 AA156755
109022 RC_AA157291 AA157291
109023 RC_AA157293 AA157293
109068 RCJW164293J AA164293
109072 RC_AA164676 AI732585
129021 RC_AA167375 AL044675
130346 RC_AA167550 H05769
109146 RC_AA176589 AA176589
109172 RC_AA180448 AA180448
131080 RC_AA187144_s NM_001955
129208 RC_AA189170J AI587376
109222 RC_AA192757 AA192833
109300 RC_AA205650 AA418276
109481 RC_AA233342 AA878923
109485 RC_AA233472 BE619092
109516 RC_AA234110 AI471639
109537 RC_D80981 AI858695
109556 RC_F01660 AI925294
109577 RC_F02206 F02206
109578 RC_F02208 F02208
109595 RC_F02544 AA078629
109625 RC_F03918 H29490
131983 RC_F04258_s AF119565
109648 RC_F04600 H17800
109671 RC_F08998 R59210
109699 RC_F09605 H18013
109820 RC_F11115 AW016809
109933 RC_H06371 R52417
110014 RC_H10995 AL109666
110039 RC_H11938 H11938
110099 RC_H16568 R44557
110107 RC_H16772 AW151660
110155 RC_H18951 AI559526
110197 RC_H20859 AW090386
110223 RC_H23747 H19836
110306 RC_H38087 H38087
110335 RC_H40331 H65490
110342 RC_H40567 H40961
110395 RC_H469S6 AA025116
110511 RC.H56640J H56640
110523 RC_H57154 AI040384
110715 RC_H95712 H95712
110754 RC_N20814 AW302200
130132 RC_N25249 U55936
131135 RC_N27100 NM_016569
134263 RC_N39616 AW973443
110938 RC_N48982 N48982
110983 RC_N51957 NM_015367
115062 RC_N52271 AA253314
111081 RC_N59435 AI146349



Hs.68270 EST 



Hs.68846 

Hs.76698 

H<lZ7B573 

Hs.155223 

Hs.71052 

Hs.49376 

Hs.62738 

Hs.61661 

Hs.323780 

Hs.44155 

Hs.62180 

Hs^726 

Hs.5105 



Hs.188580 

Hs.332436 

Hs.72116 

H&285107 

Hs.72150 

Hs^1479 

Hs.72168 

Ks.72545 

Hs.22394 

Hs.173081 

Hs.188757 

Hs.142078 

Hs.144300 

Hs.2271 

Hs.109441 

Hs^33512 

Hs.170142 

Hs^89069 

Hs28465 

Hs.71913 

Hs.34898 

Hs.87385 

Hs.296639 

Hs27214 

H5L27301 

Hs22697 

Hs. 184011 

Hs.7154 

H*l26634 

Hs.167483 

Hs.323795 

Hs^0945 

Hs.7242 

Hs.21907 

Hs.23748 

Hs.31444 

Hs.93522 

Hs.112278 

Hs.31697 

Hs.105509 

Hs.18845 

Ks.33008 

Hs.33333 

Hs.221460 

Hs.19102 

Ks.269029 

Hs.6336 

Hs.184376 

Hs.267182 

Hs.8086 

Hs.38034 

Hs.10267 

Hs.154103 

H&271614 



gb:zn08g12.s1 Stratagene hNT neuron (937233) Homo sapiens cDNA clone 3' similar to
gb:zn06g09.s1 Stratagene hNT neuron (937233) Homo sapiens cDNA clone IMAGE:546688 3',
gb:zm26c06.s1 Stratagene pancreas (937208) Homo sapiens cDNA clone 3' similar to
ESTs
stress-associated endoplasmic reticulum protein 1; ribosome associated membrane protein 4
H-2K binding factor-2
stanniocalcin 2
ESTs
hypothetical protein FLJ20644
ESTs
ESTs, Weakly similar to AF174605 1 F-box protein Fbx25 [H.sapiens]
ESTs
DKFZP585G1517 protein
anillin (Drosophila Scraps homolog), actin binding protein
high-mobility group (nonhistone chromosomal) protein isoform I-C
hypothetical protein FLJ10569
gb:zo09e06.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone
ESTs
EST
hypothetical protein FLJ20992 similar to hedgehog-interacting protein
hypothetical protein FLJ13397
ESTs
ubinuclein 1
ESTs
ESTs
hypothetical protein FLJ10893
KIAA0530 protein
Homo sapiens, clone MGC:5564, mRNA, complete cds
EST
EST
endothelin 1
MSTP033 protein
similar to rat myomegalin
ESTs
hypothetical protein FLJ21016
Homo sapiens cDNA: FLJ21869 fis, clone HEP02442
ESTs
ESTs
ESTs
Homo sapiens potassium channel subunit (HERG-3) mRNA, complete cds
ESTs
ESTs
ESTs
pyrophosphatase (inorganic)
ESTs
ESTs
ESTs
ESTs
Homo sapiens clone 24993 mRNA sequence
Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 35907
histone acetyltransferase
ESTs
ESTs
Homo sapiens mRNA for KIAA1647 protein, partial cds
arrestin, beta 1
ESTs
CTL2 gene
ESTs
ESTs
ESTs
ESTs
ESTs, Weakly similar to organic anion transporter 1 [H.sapiens]
ESTs
KIAA0672 gene product
synaptosomal-associated protein, 23 kD
TBX3-iso protein
RNA (guanine-7-) methyltransferase
Homo sapiens cDNA FLJ12924 fis, clone NT2RP2004709
MM protein
LIM protein (similar to rat protein kinase C-binding enigma)
CGI-112 protein
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111128 RC.N54139 

135244 RC N66981 

111216 RCJJ68640 

437562 RCJJ59352 

131002 RC.N95226 

111399 RCJW0138 

111514 RC_R07998 
similar to 

130182 RCJW8929 

111574 RC_R10307 

111804 RC.R33354 

111831 RCJB6083 

128875 RCJB7938J 

111904 RC.R39330 



133868 RC_R40816_s 

112033 RC_R43162_s 

130987 RCJW5698 

112300 RC_R54554 

112513 RC.R68425 

112514 RC_R68568 
112522 RC_R68763 
112540 RC_R70467 
130346 RC.R73565 
129534' RC_R73640 
112597 RC.R78376 
112732 RC.R92453 
131458 RCJT03865 
112888 RCJT03872 
131863 RC_T10072 
112911 RC.T10080 
132215 RC_T10132 
112931 RC.T15343 
112984 RC.T23457 
112998 RCJ23555 - 
133376 RCJT23670 
113026 RCJ23948 
113070 RC.T33464 
128970 RCJT34413 
113074 RC.734611 
113095 RC.T40920 
113179 RC.T55182 
113337 RCJ77453 
113421 RC.T84039 
113454 RCT85458 
113481 RC.T87693 
131441 RC.T89350_s 
113557 RCJ90945 
113559 RCJ90987 
113589 RCJT91863 
113591 RCJ91B81 
113619 RC_T93783_S 
113683 RCJT96687 
113692 RCJ98944 
113702 RCJ97307 
mRNA 

113717 RC.T97764 

113824 RCJAI48817 

113840 RCJ/V58343 

113844 RCLW59949 
PROTEIN TC10 

113902 RCJW74644 

113904 RC.W74761 

113905 RCJV74802 

113931 RC..W81205 

113932 RCJV81237 
131965 RCJV90146J 
114035 RCJY9Z798 
114106 RC^38412 
133593 RC_Z38709 
114161 RC_Z38904 
424949 RC_Z391Q3 
13059 RCJ39930J 
128937 RCJ39B2B 
WARNING 

130983 RCJi40012Li 



AW505384 

AI834273 

AW139408 

AB001636 

AL050295 

AW270776 

R07998 

BE267033 

AIQ24145 

AA482478 

R36095 

NM.015556 

241572 

AB012193 

R49031 

BE613269 

H24334 

R68425 



R68857 

R69751 

H05769 

AK002126 

R78376 

R92453 

BE297567 

AW195317 

AI656378 

AW732747 

AL035703 

T02966 

T16971 

H11257 

BE618768 

AA376654 

AB032977 

A1375672 

AK001335 

AA828380 

BE622021 

T77453 

AI769400 

A1022166 

TB7693 

AA302862 

H66470 

T79763 

AI078554 

T91881 

R08665 

AB035335 

AL360143 

T97307 

T99513 
AI631964 
R72137 
AI369275 

AA340111 

AF125G44 

R81733 

BE255499 

AA256444 

W79283 

W92798 

AW602528 



Hs.19074 

Hs.9711 

Hs.152940 

Hs.5683 

Hs.22039 

Hs.18857 



Hs.192853 
Hs.188526 
Hs.181785 



BE548222 
AF052212 
AW069534 
AA251380 



Hs.172180 



Hs.183874 

Hs£26Z7 

Hs.21893 

Hs.26125 

Hs.13809 

Hs.183373 

Hs.265499 

Hs.188757 

Hs, 11260 

Hs.29733 

Hs.34590 

Hs.27047 

Hs.107716 

Hs.33461 

Hs.13493 

Hs.4236 

Hs.167428 

Hs.269014 

Hs.22968 

HS7232 

Hs.183684 

Hs.6298 

Hs.165028 

Hs.31137 

Hs.126733 

Hs.152571 

Hs.302234 

Hs.189729 

Hs.16188 

Hs.204327 

Hs.90053 

Hs.16004 

Hs.14514 

Hs.15682 

Hs.200597 

Hs.17244 

Hs.144519 

Hs.17936 



Hs.187447 
Hs.34447 
Hs.7949 
Hs.243010 

Hs.100009 

Hs.19196 

Hs.33106 

Hs.3496 

Hs.126485 

Hs.35962 

Hs^69181 

H&238272 
Hs^99883 
Hs.153934 
Hsi279583 
Hs.10726 



LATS (large tumor suppressor, Drosophila) homolog 2
novel protein
ESTs
DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 15
KIAA0758 protein
ESTs
gb:yf16g11.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:127076 3'
ubiquitin-conjugating enzyme E2G 2 (homologous to yeast UBC7)
ESTs
ESTs
ESTs
KIAA0440 protein
gb:HSCZYB122 normalized infant brain cDNA Homo sapiens cDNA clone c-zyb12, mRNA
cullin 4A
ESTs
hypothetical protein DKFZp761N0824
ESTs
hypothetical protein FLJ10648
src homology 3 domain-containing protein HIP-55
ESTs
gb:yi40a10.s1 Soares placenta Nb2HP Homo sapiens cDNA clone 3', mRNA sequence
Homo sapiens, clone MGC:5564, mRNA, complete cds
hypothetical protein FLJ11264
EST
ESTs
hypothetical protein FLJ20392
hypothetical protein FLJ22344
ESTs
like mouse brain protein E46
KIAA0478 gene product
ESTs
ESTs, Weakly similar to A43932 mucin 2 precursor, intestinal [H.sapiens]
Homo sapiens clone IMAGE:451939, mRNA sequence
acetyl-Coenzyme A carboxylase alpha
eukaryotic translation initiation factor 4 gamma, 2
KIAA1151 protein
ESTs
protein tyrosine phosphatase, receptor type, E
ESTs
ESTs, Highly similar to IGF-II mRNA-binding protein 2 [H.sapiens]
ESTs
ESTs
ESTs
EST
neurocalcin delta
ESTs
ESTs
ESTs
KIAA0563 gene product
hypothetical protein FLJ13605
T-cell leukemia/lymphoma 6
DKFZP434H132 protein
gb:ye53h05.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:121497 3',
ESTs
ESTs
DKFZP586B2420 protein
Homo sapiens cDNA FLJ14445 fis, clone HEMBB1001294, highly similar to GTP-BINDING
acyl-Coenzyme A oxidase 1, palmitoyl
ubiquitin-conjugating enzyme HBUCE1
ESTs
hypothetical protein MGC15749
hypothetical protein FLJ12604; KIAA1692 protein
ESTs
ESTs
gb:RC5-BT0562-260100-011-A02 BT0562 Homo sapiens cDNA, mRNA sequence
inositol 1,4,5-triphosphate receptor, type 2
hypothetical protein FLJ23399
core-binding factor, runt domain, alpha subunit 2; translocated to, 2
CGI-81 protein
ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

AW79813 Hs^7B411 NCK-associated protein 1
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114277 RC_Z40377_S 
[C.elegans] 

114304 RCJAW20 

114364 RC_Z41680 

132900 RCJ\A005112 

129034 RC_M005432 

131881 RCJW10163 

452461 RCJVA02635S 

114465 RCJ^026901 

131376 RCJWJ36867 

101567 RCJWW4644 

431555 RCJ\A046426 

132944 RC_AA054515 

114618 RCJW084162 

130274 RC.AA085749 

110330 RC_AA098874 

114648 RCJ\A101056 
IMAGE:5484293* 

114658 RC M102746 

132456 RCLAA114250. 

131319 RCJW26561. 

132225 RC J AA128980. 
IMAGE:5671643' 

132669 RCLAA129757 

114709 RCJW29921 

131973 RCJW133331 

114750 RCJW135958 

115714 RCJ\A136524_ 

114763 RCJVA147044 

114767 RC.AA14B885 

114774 RCJ\A150043 

129388 RCJ\A151621 

129183 RCJW155743 

128869 RCJWI56335 

130207 RCJ\A156336 

114798 RCJW159181 

114600 RC_AA159&25 



AI052229 


HS75373 


AO34204 


Hs.16129 


AL117427 


Hs. 172778 


AA777749 


Hs£978 


AA481157 


Hs.108110 


AW361018 


Hs3383 


N78223 


Hs.108106 


BE621056 


Hs.131731 


AKD01644 


Hs^6156 


M33552 


Hs^6729 


A1815470 


Hs^60024 


T96641 


Hs.6127 


AW979261 


Hs.291993 


AA12B376 


Hs.153884 


AI288666 


Hs.16621 


AA101056 




AA102383 


Hs^49190 



.s AB011084 
s NM.003155 
i AA126980 

W38586 
AA397651 
AB018284 
AA887211 
.ST19228 
AA810755 



114628 RC_AA234185 
114846 RCJ*A234929 
114848 RC_AA234935 
114902 RCJU236359 
132271 RCJ\A236466 
114907 RCJ\A236535 
135159 RCJ\A236935_s 
132204 RCJ\A236942 
114928 RCJv\237018 
132481 RC.AA237025 
114932 RC.AA242751 
314162 RCLAA242760 
131006 RCJ\A242763 
114935 RCJIA242809 
WARNING 

132454 RC^AA243133 
437754 RC_jAA243495 
114957 RCJW243706 
114974 RCJU250848 
114977 RC_AA250868 
114995 RC.AA251152 
115005 RCJ\A251544_s 
417177 RCJ\A251792 
131889 RC.AA252063 
115026 RCLAA252144 
115045 RC.M252524 
115068 RC_M253461 
133138 RCJ\A255522 
RECEPTOR, 
115114 RC.AA256468 
129584 RCJ\A255528 
115137 RC_AA257976 
134312 RCJ\A258296 

115166 RCJ\A258409 

115167 RC.AA258421 
129807 RCJU262077 
115239 RCJW278650 
115243 RCLAA278766 



AV656017 
AA662477 
BE561824 
AA768242 
AF044209 
AA159181 
219448 

AA252937 

BE016662 ' 

BE614347 

AW275480 

AB030034 

N29390 

U43374 

AA235827 

AA237018 

W93378 

AA971436 

BE041820 

AF064104 

H23329 

BE296227 

R60366 

AW170425 

AW966931 

AW296978 

AA769266 

AI760825 

NNL004458 

NM.002589 

AA251972 

AW014549 

AW512260 

AV657594 

AA527548 

AV656017 

AW968304 

AB011151 

AF095727 

AA749209 

Y11192 

BE251328 

AA806SO0 



Hs.48924 
Hs.25590 



Hs.293981 

Hs.301959 

Hs.158688 

Hs.129467 

Hs.172572 

Hs.88977 

Hs.154443 

Hs.184325 

Hs.110964 

H3J273369 

Hs.60618 

Hs.144904 

Hs.54900 

Hs.131887 

Hs.283522 

Hs.166196 

Hs.169615 

Hs.39504 

Hs.115175 

Hs.13804 

Hs.95631 

HS42265 

HSS4869 

Hs.49614 

Hs.16218 

Hs.38516 

Hs^2116 

Hs29dm 

H&250822 

Hs.5822 

Hs^7S80 

Hs.179662 

Hs^7787 

Hs.193657 

Hs. 111 339 

Hs.81452 

HS34073 

Hs.188718 

Hs^8373 

Hs.87767 

Hs.181161 

Hs.7527 

Hs.184325 

Hs.56156 

Hs.334659 

Hs.287832 

Hs.43728 

Hs.5299 

Hs.73291 

Hs.1 16665 



ESTs, Weakly similar to T20410 hypothetical protein E02A10.2 - Caenorhabditis elegans
ESTs
Homo sapiens mRNA; cDNA DKFZp566P013 (from clone DKFZp566P013)
LIM domain only 7
DKFZP547E2110 protein
upstream regulatory element binding protein 1
transcription factor
hypothetical protein FLJ11099
hypothetical protein FLJ10782
lysosomal
Cdc42 effector protein 3
Homo sapiens cDNA: FLJ23020 fis, clone LNG00943
ESTs
ATP binding protein associated with cell differentiation
DKFZP434I116 protein
gb:zn25b03.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone
tumor necrosis factor receptor superfamily, member 10a
KIAA0512 gene product ALEX2
stanniocalcin 1
gb:zo09a11.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone
guanine nucleotide binding protein (G protein), gamma 3, linked
proline synthetase co-transcribed (bacterial homolog)
KIAA0741 gene product
ESTs
hypothetical protein FLJ20093
hypothetical protein dJ511E16.2
minichromosome maintenance deficient (S. cerevisiae) 4
CGI-76 protein
hypothetical protein FLJ23471
uncharacterized hematopoietic stem/progenitor cells protein MDS027
hypothetical protein
nuclear receptor co-repressor 1
serologically defined colon cancer antigen 1
ESTs, Weakly similar to T24396 hypothetical protein T03F6.2 - Caenorhabditis elegans
Homo sapiens mRNA; cDNA DKFZp434J1912 (from clone DKFZp434J1912)
ATPase, Class I, type 8B, member 1
hypothetical protein FLJ20989
hypothetical protein MGC4308
sterile-alpha motif and leucine zipper containing kinase AZK
hypothetical protein dJ462O23.2
Human normal keratinocyte mRNA
ESTs
ESTs
ESTs
KIAA0903 protein
Homo sapiens, clone MGC:15887, mRNA, complete cds
CDC14 (cell division cycle 14, S. cerevisiae) homolog B
ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION
serine/threonine kinase 15
Homo sapiens cDNA: FLJ22120 fis, clone HEP18674
ESTs
nucleosome assembly protein 1-like 1
ESTs
ESTs
ESTs
fatty-acid-Coenzyme A ligase, long-chain 4
BH-protocadherin (brain-heart)
ESTs
ESTs
ESTs
Homo sapiens cDNA FLJ14643 fis, clone NT2RP2001597, weakly similar to RYANODINE
small fragment nuclease
CGI-76 protein
ESTs
hypothetical protein MGC14139
myelin protein zero-like 1
hypothetical protein
aldehyde dehydrogenase 5 family, member A1 (succinate-semialdehyde dehydrogenase)
hypothetical protein FLJ10881
KIAA1842 protein
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100850 RCJttZ79667_s 
126884 RCLAA280791 
115322 RCJW28Q819 
133826 RC.AA280828 
5 115372 RCJVA282195 
132825 R<LAA283127_s 
130269 RC_AA284694 
129192 RCJVA291137 
452598 RC.AA291708 

10 WARNING 

132131 RCJW293495 
115536 RC.AA347193 
132411 RCJ\A398474ji 
115575 RCJ\A398512 

15 115601 RCJ\A400277 
WARNING 

103928 RCJW0Q896 
125819 RCJW404494 
115683 RC_jAA410345 
20 115715 RCJ\A416733 
132952 RCJ\A425154 
115819 RCJU426573 
132525 RCJW431418 
115895 RCJW436182 . 
25 132333 RC.AA437099 
115962 RCJ\A446585 
115967 RCJ\A446887 
115974 RCJW47224 
115985 RC.AA447709 
30 129254 RC_AA453624 
133071 RCJ\A455044 
116095 RCJW56045 
122691 RCJW160454_s 
116210 RC.M476494 
35 116213 RCJ\A476738 
134585 RCLAA481422 
134790 RCJW482269 
116265 RCj\A482595 
129334 RCJW85084_s 
40 116274 RCj\A485431_s 
303150 RC_jAA489057 
129945 RC.AA489638 
116331 RCJW491000 
116333 RCJW491250 
45 132994 RCLAA5Q5133 
134577 RCJW598447 
116391 RC_AA599243 
116394 R<LAA599574J 
134531 RC^AA600153 
50 116417 RC.AA609309 
116429 RCJW6Q9710 
116439 RCJ\A610068 
116459 RC_AA621399 
427505 RCJ\A621752 
55 132699 RC.C21523 
116541 RC.D12160 
132557 RCLD197Q8 
112259 RCJ)25801 
116571 RCJM5652 
60 sequence. 

129815 RC_D60208J 
421919 RC_D80504_s 
116643 RC.F03010 
116661 RCJW247 



65 remove 






116715 RCLF1Q966 

116729 RG.F13700 

318709 RC_K05063 

134760 RC.H16758 

116773 RC_H17315_s 

106425 RC.H22556 

116780 RC_H22566 

131978 RCJW459_s 

116819 RO.H53073 

111428 RC_H56559j3 

133175 RC_H57957_s 



AA836472 

U49436 

L08895 

AW836130 

AW014385 

U82671 

F05422 

AA286914 

A1831594 

AF069291 
AK001468 
AA059412 
AA393254 
AA148984 

D14540 

AA044840 

AF255910 

BE395161 

AI658580 

AA486620 

AW292809 

AB033035 

AA192669 

A1636361 

AI745379 

BE513442 

AA447709 

AA252468 

BE384932 

AA043429 

R19768 

BE622792 

AA292105 

D14041 

BE0Q2798 

BE297412 

AW157022 

AI129767 

AA887146 

BE514376 

N41300 

AF155827 

AA1 12748 

BE244323 

T86558 

NMJXJ6Q33 

AI742845 

AW499664 

AF191018 

AA251594 

R80137 

AA361562 

AW449822 

D12160 

AA114926 

AA337548 

D45652 

BE565817 
AJ224901 
AB67044 
R61504 

AL1 17440 

BE549407 

R52576 

NM.000121 

AI823410 

H24201 

H225*ffi 

AA355925 

H53073 

AL031428 

AW955632 



Hs37939 

Hs285236 

Hs.78995 

Hs.75277 

Hs.88678 

H&576S8 

Hs.168352 

Hs.183299 

Hs.68647 

Hs.40539 
Hs.62180 
Hs.47986 
Hs.43619 
Hs.48849 

Hs.199160 

Hs^51871 

Hs.54650 

Hs.1390 

Hs.61426 

Hs.41135 

Hs.50727 

Hs.51965 

Hs.45032 

Hs.179520 

Hs.42911 

Hs.238944 

Hs.268115 

Hs.1098 

Hs.64313 

Hs.62618 

Hs.172788 

Hs.172788 

Hs.326740 

Hs^78573 

Hs.287850 

Hs.55189 

Hs.4947 

Hs.182674 

Hs.8217 

Hs.165998 

Hs.71616 

Hs.203953 

Hs.279905 

Hs.85951 

Hs.75113 

Hs.65370 

Hs.110713 

Hs.12484 

Hs.279923 

Hs.43913 

Hs.302738 

Hs.178761 

Hsi5200 

Hs.249212 

Hs.5122 

Hs.333402 



Hs.26493 

Hs.109526 

Hs.153638 



Hs.170263 

Hs.115823 

Hs.285280 

Hs.89548 

Hs.169149 

Hs247423 

Hs30098 

Ks.38232 



KIAA1856 protein

MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C)

hypothetical protein FLJ13910

ESTs, Weakly similar to Unknown [H.sapiens]

Empirically selected from AFFX single probeset

nucleoporin-like protein 1

ESTs 

ESTs, Weakly similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE CONTAMINATION

chromosome 8 open reading frame 1

anillin (Drosophila Scraps homolog), actin binding protein

hypothetical protein MGC10940

ESTs

ESTs, Weakly similar to ALU4_HUMAN ALU SUBFAMILY SB2 SEQUENCE CONTAMINATION
myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog)



Hs.174174 



junctional adhesion molecule 2 

proteasome (prosome, macropain) subunit, beta type, 2

Homo sapiens mesenchymal stem cell protein DSC96 mRNA, partial cds

endomucin-2

N-acetylglucosaminidase, alpha- (Sanfilippo disease IIIB)

KIAA1209 protein

ESTs 

hypothetical protein MGC10702 
ESTs 

hypothetical protein FLJ10631

ESTs, Weakly similar to T08599 probable transcription factor CA150 [H.sapiens]
DKFZp434J1813 protein

ESTs, Weakly similar to AF257182 1 G-protein-coupled receptor 48 [H.sapiens]
ESTs 

ALEX3 protein 
ALEX3 protein 

hypothetical protein MGC10947 

H-2K binding factor-2 

integral membrane protein 1 

hypothetical protein 

hypothetical protein FLJ22564

guanine nucleotide binding protein (G protein) alpha 12

stromal antigen 2 

PAM mRNA-binding protein 

Homo sapiens mRNA; cDNA DKFZp586N1720 (from clone DKFZp586N1720) 

hypothetical protein FLJ10339

clone HQ0310 PRO0310p1

exportin, tRNA (nuclear export receptor for tRNAs)

general transcription factor IIA

lipase, endothelial

DEK oncogene (DNA binding) 

Human clone 23826 mRNA sequence

putative nucleotide binding protein, estradiol-induced

PIBF1 gene product

Homo sapiens cDNA: FLJ21425 fis, clone COL04162
26S proteasome-associated pad1 homolog
ESTs 

polymerase (RNA) III (DNA directed) (155kD) 
ESTs 

hypothetical protein MGC12760 

gb:HUMGS02848 Human adult lung 7 directed Mbol cDNA Homo sapiens cDNA 3, mRNA 

hypothetical protein FLJ21657
zinc finger protein 198

myeloid/lymphoid or mixed-lineage leukemia 2

gb:yh16a03.s1 Soares infant brain 1NIB Homo sapiens cDNA clone 3' similar to contains Alu

tumor protein p53-binding protein, 1

ribonuclease P, 40kD subunit

Homo sapiens cDNA: FLJ2209S fis, clone HEP15953

erythropoietin receptor 

karyopherin alpha 1 (importin alpha 5)

adducin 2 (beta)

ESTs 

KIAA0186 gene product
EST

KIAA0601 protein

ESTs, Weakly similar to S19560 proline-rich protein MP4 - mouse [M.musculus]
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116844 RC_H64938j> 

116845 RC.H64973 
116892 RC_H69535 
116925 RC_H73110 

5 116981 RC_H81763 

131768 RC_H86259 

117031 RC.H88353 
contains L1 

117034 RC_H88639 

10 132542 RC.H88675 

134403 RC.H93708JJ 

117280 RC_N22107 

117344 RCJC4046 

117422 RC.N27028 

15 117475 RC_N30205 

117487 RCJB0621 

130207 RC_N33258 

117549 RC.N33390 

117683 RC.N40180 

20 IMAGE:2763873 , simSarto 

117710 RC.N45198 

104514 RC_N45979_s 

117791 RCJW8325 

117822 RCJM6913 

25 129647 RCJM9394 

117895 RC.N50655 
(H.sapiens] 

131557 RCLN50721 

133057 RC.N53143 

30 118103 RCJI55326 

118111 RC.N55493 
mRNA 

118129 RC_N57493 
IMAGE^773583*,mRNA 

35 118278 RCLN62955 

118329 RC.N63520 
3\mRNA 

118336 RCJI63604 

132457 RC_N64166 

40 118363 RCJJ64168 

118364 RC_N64191 

118475 RC.N66845 
similar to 

118491 RCJ467135 

45 118500 RC.N67295 

101663 RC_N68399 

118584 RC.N68963 
sequence 

421983 RC.N69331 

50 118661 RC.N70777 

118684 RC_N71364_s 

118689 RC_N71545_S 

118690 RC.N71571 
118766 RC.N74455 

55 118793 RCJJ75594 

118817 RCJJ79035 

118844 RCJI80279 

118919 RC_N91797 

129558 RC.N92454 

60 132692 RC.N94581 

118996 RC.N94746 

119021 RC_N98238 

119039 RC_R02384 

119063 RC.R16833 

65 WARNING 

118523 RCJW1828.S 

119111 RCJW3203 

133970 RC.R46395 

119146 RC.R58B63 

70 120296 RCJ*78248 

119239 RC.T11483 
sequence. 

119281 RC.T16896 

119298 RCJ23820 

75 126502 RCJ3Q222 

135073 RCJV15275_s 



AA649530 
AI573283 
H73110 
N29218 
AC005757 
H88353 

072209 
AL137751 
AA334551 
Ml 8217 
R19085 
AI355562 
N30205 
N30621 
AF044209 
N33390 
N40180 

N45198 
AF164622 
N48325 
AA706282 
AB018259 
AW450348 

AA317439 
AA465131 
AA401733 
N55493 

N57493 

N62955 
N63520 

BE327311 
AB017365 
AI183838 
N46114 
N66845 

AV647908 



NWL003528 
AW1 36928 

AI252640 
AL137554 
N71313 
AW390601 
N71571 
N74456 
N75594 
A1668658 
AL035364 
AW452696 
AW580922 
AW191952 
N94746 
N98238 
A1160570 
R16833 

Y07759 
T02865 
AA214228 
R58863 
AW995911 
T11483 

A1692322 
NMJJ01241 
T10077 
W55956 



HS337434 

t&38458 
Hs^6Q6Q3 
Hs.40290 
Hs318Q9 



Hs.180324 

Hs.263671 

Hs.82767 

Hs.172129 

Hs.210706 

Ks.43880 

Hs.93740 

Hs.44203 

Hs.144904 

Hs.44483 



Hs.47248 

Hs.182982 

Hs.93956 

Hs.93963 

Hs.118140 

Hs.93996 

Hs.28707 
Hs.64001 
Hs.184134 



Hs.316433 



Hs.47166 
Hs.173859 
Hs.48938 
Hs.29169 



Hs.90424 

Hs.154329 

Hs.2178 



Hs.110364 

Hs.49927 

Hs.163986 

Hs.184544 

Hs.269142 

Ks.50499 

Hs.285921 

Hs.50797 

Hs^0891 

Hs.130760 

Hs.180446 

Hs.249239 

Hs.274248 

Hs.55185 

Hs.252097 

Hs.53106 

Hs.170157 

Hs.328321 

Hs.1Z7751 

Hs.91815 

Hs.299883 



Hs.65373 
Hs.155478 
Hs.13453 
Hs.94030 



ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens]
gb'Jis44f05^1 NCL_CGAP_Alv1 Homo sapiens cONA done. mRNA sequence 
ESTs 

ESTs, Moderately similar to A47582 B-cell growth factor precursor [H.sapiens] 
ESTs 

hypothetical protein

gb7w21a02^1 Morton Fetal Cochlea Homo sapiens cDNA done IMAGE252842 3* similar to 
YY1-associated factor 2 

Homo sapiens mRNA; cDNA DKFZp434I0812 (from clone DKFZp434I0812); partial cds
sperm specific antigen 2

Homo sapiens cDNA: FLJ21409 fis, clone COL03924

Homo sapiens cDNA FLJ13182 fis, clone NT2RP3004070

ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens] 

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

ESTs

nuclear receptor co-repressor 1
EST 

gbryy44d02s1 Soares.multiple.scterosis^bHMSP Homo sapiens cDMA done 

ESTs, Highly similar to similar to Cdc14B1 phosphatase [H.sapiens]

golgin-67 

EST 

ESTs 

KIAA0716 gene product

ESTs, Highly similar to SORL_HUMAN SORTILIN-RELATED RECEPTOR PRECURSOR

signal sequence receptor, gamma (translocon-associated protein gamma) 

Homo sapiens clone 25218 mRNA sequence 

ESTs 

gb7v50c02^1 Scares fetal liver spleen 1 NFLS Homo sapiens cONA done IMAGE246146 3 1 , 

gb:yy54c08.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone

Homo sapiens cDNA FLJ11375 fis, clone HEMBA1000411, weakly similar to ANKYRIN
gb:yy62l01.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:278137

HT021 

frizzled (Drosophila) homolog 7
hypothetical protein FLJ21802
hypothetical protein FLJ22623

gb:za46c11.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:295604 3'

Homo sapiens cDNA: FLJ23285 fis, clone HEP09071
ESTs 

H2B histone family, member Q 

gb.Ul-H^)1^D>cW)8^Ul.s1 NCI_CGAP_Sub3 Homo sapiens cDMA done 3', mRNA 

peptidylprolyl isomerase C (cyclophilin C)
protein kinase NYD-SP15

Homo sapiens cDNA: FLJ22765 fis, clone KAIA1180
Homo sapiens, clone IMAGE:3355383, mRNA, partial cds
ESTs
EST

ESTs, Moderately similar to T47135 hypothetical protein DKFZp761L0812.1 [H.sapiens]
ESTs 



myosin phosphatase, target subunit 2 
karyopherin (importin) beta 1
collagen, type VIII, alpha 2
hypothetical protein FLJ20758
ESTs

pregnancy specific beta-1-glycoprotein 6

ESTs, Moderately similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

myosin VA (heavy polypeptide 12, myoxin) 
EST 

hypothetical protein 
ESTs 

hypothetical protein FLJ23399

gb:CHR90049 Chromosome 9 exon Homo sapiens cDNA done 1 1 1-1 5* and 3", mRNA 

ESTs, Weakly similar to T02345 hypothetical protein KIAA0324 [H.sapiens]
cyclin T2

hypothetical protein FLJ14753

Homo sapiens mRNA; cDNA DKFZp586E1624 (from clone DKFZp586E1624)
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R3S972 

AW082866 

Z40182 

240904 

AW959615 

AA167500 



AKD00061 
AA177105 
AA179656 

AA188175 

AJ235885 

AA837098 

A1216292 

AW295096 

T57776 



119558 RC W38194 W38194 

132736 RC_W42414_S AW081883 
sapiens mad protein 

132173 RC_W46577_s X89426 

5 134873 RC_W49632_s AA884471 

119650 RC.W57613 R82342 

119654 RC.W57759 W57759 
similar to 

119683 RCLW61118 W65379 

10 119694 RCJV65344 AA041350 

119718 RC W69216 W69216 

133010 RCJV59379 AI287518 

119938 RC_WB6728 AW014862 

120128 RC_Z38499 BE379320 

15 120130 RCJZ38630 AA045767 

120148 RC.Z39494 F02806 

120155 RC.Z39623 

131486 RC_Z40071_s 

120183 RC.Z40174 

20 120184 RC_Z40182 

120211 RC^Z40904 

120245 RCJ^166965 

120247 RCLAA167500 

120254 RCJV\169599_s W90403 

25 120259 RCJW171724 AW014786 

120260 RCJ\A171739 

120275 RC_AA177105 

120284 RCJW1B2626 

tooontains 

30 114056 RC J AA186324 

129507 RC_M192099 

120302 RCJVA192173 

120303 RCJ\A192415 
120305 RC_AA192553 

35 120319 RC_AA194851 

133389 RCJW195520_s AA195764 

120326 RCJ^196300 AA196300 

134272 RCJ\A196517 X76040 

133145 RCJW196549 H94227 

40 120327 RCLAA196721 AK000292 

106686 RCJW196729J N66397 

120328 RC.AA196979 AA923278 

120340 RC_AA206828 AA206828 
similar to 

45 134292 RCLAA207123 AJ906291 

131522 RC.AA214539J AI380040 

129051 RC_AA226914_s AA227068 

120375 RC_ J AA227260 AF028706 

120376 RC_AA227469 AA227469 
50 IMAGE«637323 , ,mRNA sequence. 

120390 RCJW233122 AA837093 

303876 RCJW233334_s U64820 
dominant ataxin 3) 

132038 RCJW233347 AJ825842 

55 104463 RCJW233519 T85825 

125750 RC.AA233714 AA01B515 

120396 RCJW233796 AA134006 

120409 RCJVV235050J AA235050 
gb!07077 

60 12)414 RC.AA235704 AW137156 

120420 RCJ\A236031 AI128114 

120422 RC_jAA236352 AL1 33097 
132221 RC_ J AA236390_s W94915 

120423 RCJW235453 AA236453 
65 120435 RC -J AA243370 AA243370 

120453 RC_AA250947 AA250947 

120455 RC.M251083 AA251720 

120456 RC.M251113 AA488750 
120473 RCJW251973 AA251973 

70 128922 RC.AA252023 AI244901 

12)477 RCJ\A252414 AA252414 

12)479 RC J AA252650 AF006689 

120488 RCJ\A255523 AW952916 

120510 RCJK258128 AI796395 

75 120527 RCJ^A262105 AA262105 

120528 RCJW262107 AI923511 



Empirically selected from AFFX single probeset 

Hs.288261 Homo sapiens cDNA: FLJ23037 fis, clone LNG02036, highly similar to HSU68019 Homo

Hs.41716 endothelial cell-specific molecule 1

Hs.90449 Human clone 23908 mRNA sequence

Hs.79856 ESTs, Weakly similar to S65657 alpha-1 ^adrenergic receptor spBce form 2 [H.sapiens] 

gbzd20g1U1 Soares_fetaLhearLNbHH19W Homo sapiens cDNA done IMAGE:341252 3 1 

Hs.57835 ESTs 

Hs.57847 ESTs, Moderately similar to ICE4_HUMAN CASPASE-4 PRECURSOR [H.sapiens]

Hs.92848 ESTs

Hs.62669 Homo sapiens mRNA; cDNA DKFZp586D0923 (from clone DKFZp586D0923)

Hs.58885 ESTs

Hs.91448 MKP-1 like protein tyrosine phosphatase

Hs.5300 bladder cancer associated protein 

Hs.65765 ESTs 

Hs.65783 ESTs 

Hs.27372 BMX non-receptor tyrosine kinase 

Hs.65882 ESTs 

Hs.65885 EST 

Hs.66012 EST 

Hs.111045 ESTs

Hs.103939 EST

Hs.111054 ESTs

Hs.192742 hypothetical protein FLJ12785

Hs.101590 hypothetical protein

Hs.78457 solute carrier family 25 (mitochondrial carrier; ornithine transporter) member 15

gb:zp54e11.s1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone 3' similar

Hs.82506 KIAA1254 protein

Hs.112160 zinc finger protein 148 (pHZ-52)

Hs.269933 ESTs 

Hs.95184 ESTs 

Hs.101337 uncoupling protein 3 (mitochondrial, proton carrier)

Hs.191094 ESTs 

Hs.72639 ESTs 

Hs.21145 hypothetical protein RG083M05.2

Hs.278614 protease, serine, 15

Hs.6592 Homo sapiens, clone IMAGE:3961368, mRNA, partial cds

Hs.278732 hypothetical protein FLJ20285

Hs.334825 Homo sapiens cDNA FLJ14752 fis, clone NT2RP3003071

Hs.290905 ESTs, Weakly similar to protease [H.sapiens] 

gb:zq80b08.s1 Stratagene hNT neuron (937233) Homo sapiens cDNA clone IMAGE:647895 3'

Hs.81234 immunoglobulin superfamily, member 3

Hs.239489 TIA1 cytotoxic granule-associated RNA-binding protein

Hs.108301 nuclear receptor subfamily 2, group C, member 1

Hs.111227 Zic family member 3 (odd-paired Drosophila homolog, heterotaxy 1)

gb:zr18a07.s1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone

Hs.111460 calcium/calmodulin-dependent protein kinase (CaM kinase) II delta

Hs.66521 Machado-Joseph disease (spinocerebellar ataxia 3, olivopontocerebellar ataxia 3, autosomal 

Hs.3776 zinc finger protein 216 

Hs^46885 hypothetical protein FLJ20783

Hs.264482 Homo sapiens mRNA; cDNA DKFZp761A0411 (from clone DKFZp761A0411)

Hs.79306 eukaryotic translation initiation factor 4E

gb:zs38e04.s1 Soares_NhHMPu_S1 Homo sapiens cDNA clone IMAGE:687486 3' similar to

Hs.181202 hypothetical protein FLJ10038

Hs.112885 spinal cord-derived growth factor-B

Hs.301717 hypothetical protein DKFZp434N1928

Hs.42419 ESTs 

Hs.18978 Homo sapiens cDNA: FLJ22822 fis, clone KAIA3968

Hs.96450 EST

Hs.170263 tumor protein p53-binding protein, 1

Hs.104347 ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!! [H.sapiens]

Hs.88414 BTB and CNC homology 1, basic leucine zipper transcription factor 2 

Hs.269988 ESTs

Hs.9589 ubiquilin 1

Hs.43141 DKFZP727C091 protein

Hs.110299 mitogen-activated protein kinase kinase 7

Hs.63510 KIAA0141 gene product

Hs.111377 ESTs

Hs.4094 Homo sapiens cDNA FLJ14208 fis, clone NT2RP3003264

Hs.104413 ESTs 
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120529 RCJ\A262235 
120541 RC_ J AA278298 
131445 RC.AA278529J 
120544 RCJW278721 
5 120562 RCJVA280036 
120569 RCJVA280648 

120571 RC^AA280738 

120572 RC_ J AA280794 
129434 RC.M280837 

10 130529 RCJ\A280886 
repetitive 

120575 RC_AA280934 
132635 RCJVA281535 
120591 RCuAA281797^s 

15 120593 RC.M282047 
430275 RC.M283002 
117729 RC.AA283709 
12)609 RCJ\A2B3902 
132754 RCJ\A284108 

20 130315 RCJW284109 
132614 RC.AA284371 
447503 RCJW284744J 
cds 

135376 RCJW284784 
25 120521 RCLAA284840 
107868 RC_AA285844 
129868 RCJW287032 
120644 RCLAA28703B 

120660 RCJ\A287546 
30 135370 RCLAA287553_s 

120661 RC.AA287556 
129116 RCAA2875S4 
131567 RCJW291015.S 
120699 RCJ\A291716 

35 100690 RCJ\A291749_s 
120726 RC_AA293656 
120737 RC_AA3Q2430 
120745 RC_AA302809 
135192 RC_AA302820_s 

40 120750 RCJW31Q499 
120761 RC_AA321890 

120768 RCJVA340589 

120769 RC.AA340622 
135232 RCJ\A342457J 

45 CONTAMINATION 

133439 RCJW342828.S 
120793 RCJW342864 
120796 RCJVA342973 
120809 RC.AA346495 

50 repeat mRNA sequence. 
132459 RCLAA347573 
120825 RCJ\A347614 
120827 RCJW347717 
120839 RCJW348913 

55 repeat mRNA sequence. 
120850 RCJ\A349647 
120852 RCJW349773 
128852 RCJ\A350541_S 
135240 RC.AA357159J 

60 120870 RCJW357172J 
WARNING 

134637 RCLAA369856.S 
120894 RCJ\A370132 
131854 RCJ\A370472_s 
65 120897 RC_AA370867 
120915 RC J AA377295 

120935 RC^AA383902 
WARNING 

120936 RC_AA385934 
70 120937 RCJ\A386255 

120938 RC^AA386260 
129722 RC_jAA386266 
120960 RCJW398014 
120985 RC.AA398222 
75 120988 RCJW398235 



AI434823 Hs.104415 ESTs

W07318 Hs.240 M-phase phosphoprotein 1

NM_014264 Hs.172052 serine/threonine kinase 18

BE548277 Hs.103104 ESTs

BE244580 Hs.302267 hypothetical protein FLJ10330

AA807544 Hs.24970 ESTs, Weakly similar to B34323 GTP-binding protein Rab2 [H.sapiens]

AB037744 Hs.34892 KIAA1323 protein 

H39599 Hs.294008 ESTs 

AW957495 Hs.186644 ESTs

AA178953 gb:zp39e03.s1 Stratagene muscle 937209 Homo sapiens cDNA clone 3' similar to contains Alu

AW978022 Hs.238911 hypothetical protein DKFZp762E1511; KIAA1816 protein

AB020686 Hs.54037 ectonucleotide pyrophosphatase/phosphodiesterase 4 (putative function)

AF078847 Hs.191355 general transcription factor IIH, polypeptide 2 (44kD subunit)

AA748355 Hs.193522 ESTs

Z11773 Hs.237786 zinc finger protein 187

AA306166 Hs.7145 calpain 7

AW978721 Hs.265076 ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens]

AI752244 Hs.75309 eukaryotic translation elongation factor 2

AI241084 Hs.154353 nonselective sodium potassium/proton exchanger

AA284371 Hs.118064 similar to rat nuclear ubiquitous casein kinase 2

AA115496 Hs.336898 Homo sapiens, Similar to RIKEN cDNA 1810038N03 gene, clone MGC:9890, mRNA, complete

BE617856 Hs.99756 mitochondrial ribosome recycling factor 

AW961294 Hs.143818 hypothetical protein FLJ23459

AA286844 Hs.61260 hypothetical protein FLJ13164

AW172431 Hs.13012 ESTs

AI869129 Hs.96616 ESTs

AA286785 Hs.99677 ESTs

BE622187 Hs.99670 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

AA287556 Hs.263412 ESTs, Weakly similar to ALUB_HUMAN !!!! ALU CLASS B WARNING ENTRY !!! [H.sapiens]

AB019494 Hs.225767 IDN3 protein

AF015592 Hs.28853 CDC7 (cell division cycle 7, S. cerevisiae, homolog)-like 1

AI683243 Hs.97258 ESTs, Moderately similar to S29539 ribosomal protein L13a, cytosolic [H.sapiens]

AA383256 Hs.1657 estrogen receptor 1 

AA293655 Hs.97293 ESTs 

AL049176 Hs.82223 chordin-like

AA302809 gb:EST10426 Adipose tissue, white I Homo sapiens cDNA 3' end, mRNA sequence.

U83993 Hs.321709 purinergic receptor P2X, ligand-gated ion channel, 4

AI191410 Hs.96693 ESTs, Moderately similar to 2109260A B cell growth factor [H.sapiens]

AA321690 Hs.1265 branched chain keto acid dehydrogenase E1, beta polypeptide (maple syrup urine disease)

AA340589 Hs.104560 EST 

AI769467 Hs.96769 ESTs 

AL038812 Hs.96800 ESTs, Moderately similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE 

Z23091 Hs.73734 glycoprotein V (platelet) 

AA342864 Hs.96812 ESTs 

AI247356 Hs.96820 ESTs

AA346495 gb:EST52657 Fetal heart II Homo sapiens cDNA 3' end similar to EST containing O family

AL120071 Hs.48998 fibronectin leucine rich transmembrane protein 2

AI280215 Hs.96885 ESTs

AA382525 Hs.132967 Human EST clone 122887 mariner transposon Hsmar1 sequence

AA348913 gb:EST55442 Infant adrenal gland II Homo sapiens cDNA 3' end similar to EST containing Alu

AA349647 Hs.96927 Homo sapiens cDNA FLJ12573 fis, clone NT2RM4000979

AA349773 Hs.191564 ESTs 

R40622 Hs.106601 ESTs 

AA357159 Hs^6986 EST 

AA357172 Hs.292581 ESTs, Moderately similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

U87309 Hs.180941 vacuolar protein sorting 41 (yeast homolog)

AA370132 Hs.97063 ESTs

AF229839 Hs.173202 I-kappa-B-interacting Ras-like protein 1

AA370867 Hs.97079 ESTs, Moderately similar to AF174605 1 F-box protein Fbx25 [H.sapiens]

AL135556 Hs.97104 ESTs

AL048409 Hs.97177 ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

AA385934 Hs.97184 EST, Highly similar to (define not available 7499603) [C.elegans]

AA386255 Hs.97186 EST 

AA386260 Hs.104632 EST 

R20855 Hs.5422 glycoprotein M6B 

AA398014 Hs.104684 EST 

AI219895 Hs.97592 ESTs 

AA398235 Hs.97631 ESTs 






121008 RC.AA338348 
GSSsandaCpG 

121029 RC_AA398482 

121032 RCJVA398504 

121033 RC AA398505 

121034 RCJIA398507 

121035 RCJK398523 
121058 RC_M39B625 

121060 RCLM398632 

121061 RCJ\A398633 

121091 RCJ\A398894 
CONTAMINATION 

121092 RCJ\A398895 
121094 RCJ\A398900 
121096 RCM39B904 
121115 RCJ\A399122 

121121 RC.AA399371 

121122 RC.AA399373 
121125 RC_AA399441 
121151 RCJ\A399636 
121153 RC.AA399840 
121163 RC_AA399630 
121176 RC_AA400080 
121192 RC.AA400262 
121223 RC _AA4(X)725 
121227 RC.AA4Q0748 
121231 RCJ\A400780 

121278 RCJW01631 

121279 RCJU401688 
121282 RCJ*A401695 
121299 RCJW402227 

121301 RCJ\A402329 

121302 RC.AA402398 

121304 RCJ\A402449 

121305 RC_AA402468 
134721 RCJVU03268. 

121323 RC_AA403314 

121324 RC.AA404229 
129047 RC.AA404260 
131074 RC.AA404271 
121344 RC_AA405026 
121348 RC.jAA4Q5182 
121350 RC_AA405237 
contains Alu 

121400 RCJUU06061 

121402 RC.AA406063 

121403 RC_AA406070 
121408 RC_AA4Q6137 
121431 RCJW406335 
132936 RC.AA411801 
121471 RCJ^411804 
121474 RCJM11833 
121526 RC_M412219 

.121530 RC.AA412259 

121558 RC.AA412497 
contains L1.0L1 

121559 RCJW412498 
121584 RC_AA416586 
121609 RC^AA416867 
121612 RCJW416874 
121737 RC.AA421133 
121740 RCJW421138 
129194 RC>A422079 
121784 RC_AA423837 

121802 RC_AA424328 

121803 RCJW424339 
135286 RCJW424469. 
121806 RCJVA424502 
1E517 RC.AA425004 
121845 RC_AA425734 
COMTAMINATION 
121853 RC.AA425887 
121891 RC.AA426456 
121895 RCJK427396 
similar to contains 
121899 RCJW427555 



AA398348 Hs.301720 Human DNA sequence from clone RP11-251J8 on chromosome 13 Contains ESTs, STSs,

AA398482 Hs.97641 EST

AA393037 Hs.161798 ESTs

AA398505 Hs.97360 ESTs

AL389951 Hs.271623 nucleoporin 50kD

AA398523 Hs.210579 ESTs 

AA398625 Hs.97391 ESTs 

AA398632 Hs.97395 ESTs

AA393288 Hs.97396 ESTs

AA398894 Hs.97657 ESTs, Moderately similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE

AA398895 Hs.97658 EST

AA402505 gb:zi62h10.r1 Soares_testis_NHT Homo sapiens cDNA clone 5', mRNA sequence

AA398904 Hs.332690 ESTs

AA398187 Hs.104682 ESTs, Weakly similar to mitochondrial citrate transport protein [H.sapiens]

AA399371 Hs.189095 similar to SALL1 (sal (Drosophila)-like

AI126713 Hs.192233 ESTs, Highly similar to T00337 hypothetical protein KIAA0568 [H.sapiens]

AL042981 Hs.251278 KIAA1201 protein

AA399636 Hs.143629 ESTs 

AA399640 Hs.97694 ESTs 

AI676062 Hs.111902 ESTs

AL121523 Hs.97774 ESTs

AA400262 Hs.190093 ESTs

AI002110 Hs.97169 ESTs, Weakly similar to OM667H1 2£1 [H.sapiens]

AA400748 Hs.97823 Homo sapiens mRNA; cDNA DKFZp434D024 (from clone DKFZp434D024)

AA814948 Hs.96343 ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!! [H.sapiens]

AA037121 Hs.98518 Homo sapiens cDNA FLJ11490 fis, clone HEMBA1001918

AA292873 Hs.177996 ESTs

AA401695 Hs.97334 ESTs

AA402227 Hs.22826 tropomodulin 3 (ubiquitous)

NM_006202 Hs.89901 phosphodiesterase 4A, cAMP-specific (dunce (Drosophila)-homolog phosphodiesterase E2)

AA402587 Hs.325520 LAT1-3TM protein 

AA293863 Hs.97316 EST 

AA402468 Hs.291557 ESTs 

AK000112 Hs.89306 hypothetical protein FLJ20105

AA291411 Hs.97247 ESTs 

AA404229 Hs.97842 EST 

AI768623 Hs.108264 ESTs 

U16125 Hs.181581 glutamate receptor, ionotropic, kainate 1 

AA405026 Hs.193754 ESTs 

AA405182 Hs.97973 ESTs 

AA405237 gb:zt06e10.s1 NCLCGAPJ3CB1 Homo sapiens cDNA clone IMAGE:712362 3' similar to 

AA406061 Hs.98001 EST

AA406083 Hs.98003 ESTs 

AA406070 Hs.98004 EST 

AA406137 Hs.98019 EST 

AA035279 Hs.176731 ESTs 

AL120659 Hs.6111 aryl-hydrocarbon receptor nuclear translocator 2

AA411804 Hs.261575 ESTs

AA402335 Hs.188760 ESTs, Highly similar to Trad [H.sapiens]

AW665325 Hs.98120 ESTs 

AA778658 Hs.98122 ESTs 

AA412497 gb:zt95g12.s1 Soares_testis_NHT Homo sapiens cDNA clone IMAGE:730150 3' similar to

AI192044 Hs.104778 ESTs

AI024471 Hs.98232 ESTs 

AA416867 Hs38185 EST 

AA416874 Hs.98168 ESTs 

AA421133 Hs.104671 erythrocyte transmembrane protein

AA421138 Hs.98334 EST

AA150797 Hs.109276 latexin protein

T90789 Hs34308 RAB35, member RAS oncogene family 

AI251870 Hs.188898 ESTs 

AI338371 Hs.157173 ESTs 

s AW023482 Hs£7849 ESTs 

AA424313 Hs.98402 ESTs 

AW972853 Hs.112237 ESTs

AI732692 Hs.165066 ESTs, Moderately similar to ALU2_HUMAN ALU SUBFAMILY SB SEQUENCE

AA425887 Hs.98502 hypothetical protein FLJ14303

AA426456 Hs.98469 ESTs

AA427396 gb:zw33a02.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:771050 3'



R55341 HJL50421 KIAA0203 gene product
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121917 RCJU428218 AA406397 

121918 RCJUV428242 BE2746E9 

121919 RCLAA428281 AA428281 
121941 RC^AA428885 AA428855 

5 121942 RCLAA428994 AW452701 

121970 RC^AA429565 AA429668 

121993 RC_jAA430181 AW297880 

134660 RC_M430184_s U73524 

126753 RCLAA431288.S AA306478 

10 122022 RCLAA431293 AA431293 

122050 RC_AA431478 AI453076 

122051 RC.M431492 M431492 
122055 RC.AM31732 AA431732 
122105 RC.AA432278 AW241685 

15 122125 RCJ^A434411 AK000492 

135235 RCJ^A435512J AW298244 

122162 ' RCJVA435698 AA628233 

129406 RCLAA435711 AB018255 

318801 RC J AA435815_S U40763 

20 122186 RC_jAA435842 AA398811 

122235 RC.AA436475 AA436475 

129131 RC JW436489 AB026436 

134664 RCJU442060 AA256106 

122310 RCLAA442079 AW192803 

25 122334 RCJU443151 BE465894 

122382 RC_AA446133 AA446440 

122425 RC_jAA447145 AB007859 

122431 RCJM47398 AA447398 

122450 RCJ\A447643 AA447643 

30 302653 RCJU447742.S AJ404468 

122477 RC_M448226 AA448226 

122500 RCAA448825 AA448825 

122522 RC_AA449444 AA239607 

122536 RCJU450087 AF060877 

35 122538 RCJWI50211 AA450211 

122540 RC.jAA450244 AA476741 

122560 RC.AA452123 AW392342 

421919 RCJ\A452155 AJ224901 

122562 RCLAA45215B AA452156 

40 mRNA 

122585 RC_jAA453036 AI681654 

122608 RC.AA453526 AA453525 

122635 RCJVA454085 AA454085 
similar to 

45 122636 RCLAA4541Q3 AW651705 

122653 RCJM54642 AW009166 

122660 RC_AA454935 A1816827 

122703 RCJ\A456323 AA456323 

122724 RC_AA457395 AA457395 

50 122749 RCJW458850 AA458850 

122772 RO.AA459662 AW117452 

131098 RC.AA459668 U66669 

129045 RCJVA459679J5 AI082883 

122777 RC.AA459702 AK001022 

55 135362 RC.AA460017J AA978128 

122798 RC_AA4S0324 AW3 56286 

122837 RCJ\A461509 AA461509 

122860 RCLAA464414J AA464414 
mRNA sequence. • 

60 122861 RCLAA464428 AA335721 

122910 RC_AA47Q084 AA470084 

132899 RC_AA476606_s AA476606 

122967 RCLAA478521 AA806187 

129560 RCJW478523 AA317841 

65 123009 RCJ\A479949 AA535244 

128917 RCJW481252 AB55215 

123081 RCJVA485351 AI815486 

123133 RCJW487264 AA487264 

123184 RCJW189072 BE247767 

70 13671 RCJW489630 NM_014700 

123233 RCLAA490225 AW974175 
[H.saptens] 

123234 RCJW490227 NNL001938 
123236 RCJW90255 AW968504 

75 123255 RCJ\A490890 AAB30335 

129503 RC_M490916_s AW768399 



Hs.184175 

Hs.98560 

Hs.98563 

Hs.293237 

Hs.98617 

Hs.98661 

Hs.87465 

Hs.95327 

Hs.98716 

Hs.166109 

Hs.98742 

Hs.98747 



Hs.98806 

Hs.293507 

Hs.79946 

Hs.111138 

Hs.77965 

Hs.104673 

Hs. 11 2227 

Hs.177534 

Hs.87507 

Hs.98974 

Hs.98365 

Hs.98643 

Hs.100955 

Hs.99104 

Hs.112095 

H^284259 

Hs.324123 

Hs.99190 



Hs.99236 

Hs.99239 

Hs.98279 

Hs283&T7 

Hs.109526 



Hs.170737 
Hs.143077 



Hs.99519 

Hs.99376 

Hs.180069 

Hs.269369 

Hs.99457 

Ks.293372 

Hs.99489 

Hs.236642 

Hs.30732 

Hs.214397 

Hs.99513 

Hs.145696 

Hs.293565 



Hs.119394 
Hs.98358 



Hs.289101 

Hs.7645 

Hs.78305 

Hs.206097 

Hs243901 

Hs.154974 

Hs.18166 

Ks. 119004 

Hs.188751 

Hs.16697 
Hs.123073 
Hs.105273 
Hs.112157 



ESTs 

chromosome 2 open reading frame 3 

EST 

ESTs 

ESTs 

EST 

ESTs 

ATP/GTP-binding protein

CD3D antigen, delta polypeptide (TiT3 complex)

ESTs, Moderately similar to T42650 hypothetical protein DKFZp434D0215.1 [H.sapiens]

ELAV (embryonic lethal, abnormal vision, Drosophila)-like 2

EST 

EST 

ESTs 

hypothetical protein 
ESTs 

cytochrome P450, subfamily XIX (aromatization of androgens) 

KIAA0712 gene product

peptidyl-prolyl isomerase G (cyclophilin G)

ESTs 

membrane-associated nucleic acid binding protein 



ESTs 

ESTs, Weakly similar to S65824 reverse transcriptase homolog [H.sapiens]

ESTs, Weakly similar to LB4D_HUMAN NADP-DEPENDENT LEUKOTRIENE B4 12-

ESTs

KIAA0399 protein
ESTs

hypothetical protein DKFZp434F1819

dynein, axonemal, heavy polypeptide 9 

ESTs 

ESTs 

ESTs 

regulator of G-protein signalling 20
ESTs

ESTs, Weakly similar to A43932 mucin 2 precursor, intestinal [H.sapiens]
centrosomal P4.1-associated protein; uncharacterized bone marrow protein BM032
zinc finger protein 198

gb:zx29c03.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:787876 3',

hypothetical protein FLJ23251
ESTs

gb:zx33a08.s1 Soares_total_fetus_Nb2HF8_9w Homo sapiens cDNA clone IMAGE:788246 3'

hypothetical protein FLJ14007
ESTs 

nuclear respiratory factor 1 

ESTs 

ESTs 

ESTs, Weakly similar to B34087 hypothetical protein [H.sapiens]
ESTs

3-hydroxyisobutyryl-Coenzyme A hydrolase

hypothetical protein FLJ13409; KIAA1711 protein

hypothetical protein FLJ10160 similar to insulin related protein 2

ESTs, Weakly similar to T17454 diaphanous-related formin - mouse [M.musculus]

splicing factor (CC1.3)

ESTs, Weakly similar to putative p150 [H.sapiens]

gb:zx78g01.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:809904 3',

ESTs 
ESTs 

SMAD in the antisense orientation
glucose regulated protein, 58kD
hypothetical protein MGC2752
RAB2, member RAS oncogene family
oncogene TC21

Homo sapiens cDNA FLJ20738 fis, clone HEP08257

Homo sapiens mRNA; cDNA DKFZp657N064 (from clone DKFZp667N064) 

KIAA0870 protein

KIAA0655 gene product

ESTs, Weakly similar to MAPB_HUMAN MICROTUBULE-ASSOCIATED PROTEIN 1B

down-regulator of transcription 1, TBP-binding (negative cofactor 2)

CDC2-related protein kinase 7

ESTs 

ESTs 






131043 RC.AA490925 AF084535 Hs^2464 

123259 RCJ^490955 AT744152 H<l283374 
[Haptens] 

123284 RC_AA495812 AA488988 Hs^93796 

5 123286 RCJ*A495824 AA495824 Hs.188822 

123315 RCAA496369 AA4983S9 
to contains 

129179 RCLAAStWttSji AW959025 Hs.109154 

131612 RCJW521473 AU076668 Hs.334884 

10 123421 RCJ\A593440 AA598440 Hs^91154 

123449 RC.AA598899J AL049325 Hs.1 12493 

129021 RC.M599244 ALQ44675 Hs.173081 

132830 RCJ*A599694_s NMJ)14777 Hs.57730 

123497 RC^AA600037 AA765256 Hs.135191 

15 123604 RCJVA609135 AA609135 Hs^93076 

129539 RCLAA609582 T47614 Hs.323022 

123712 RCAA609684 AA609684 Hs.112748 

123731 RCJW609839 AA609839 
similar to 

20 130725 RCJW609862 T98807 Hs.80248 

123800 RCJW620423 AA620423 Hs. 11 2882 

123841 RCLAA620747 AA620747 Hs.112896 

123929 RC_M621364 AA621364 Hs. 11 2981 

123978 RC_C20653 T89832 Hs.170278 

25 133184 RCJ320085 AA001021 Hs.6685 

132835 RCJD20749 283844 Hs.5790 

132406 RC_D51285_s AL133731 ' Hs.4774 

128695 RC_D59972_i NMJXB478 Hs.101299 

124028 RC.F04112J F04112 

sequence.

124057 RC.F13604 AA902384 Hs.73853 

134899 RC.H01662 AI609045 Hs.321775 

130973 RC.H05135J AI638418 Hs.78580 

124106 RC_H12245 H12245 

35 124136 RCLH22842 H22842 Hs.101770 

124165 RC.H30894 H30039 Hs.107674 

131229 RC_H43442_s NMJJ15340 Hs2450 

124178 RCJM5996 BE463721 Hs.97101 

129948 RC_H69281J AI537162 Hs.263988 

40 134374 RC.H69485J N22687 Hs.8236 

124254 RC_H69899 H69899 
similar to 

129056 RC_H70627.s AI769958 Hs. 108336 

100919 RC_H73050_S X54534 Hs278994 

45 130724 RC_H73260 AK001507 Hs.306084 

100716 RC_H77531_S X89887 Hs. 172350 

124274 RC_H80552 H80552 Hs.102249 

129078 RC_H80737_s AI351010 Hs.102267 

124828 RC.H93412 AW952124 Hs.13094 

50 124315 RCLH94892.S NM-005402 Hs.288757 

100747 RC_H95643_s X04588 Hs.85844 

124324 RC_H96552 H96552 Hs.159472 

452933 RC.H97146 AW391423 Hs.288555 

132231 RC_H99131_s AA662910 Hs.42635 

55 129170 RC_H99462_s AW250380 Hs.109059 

133143 RC_H99837_S AA094538 Hs.272808 

132963 RCJI22140 AA099693 Hs.34851 

135297 RC_N22197 AL1 18762 Hs.300208 

134347 RC_N23756js AF164142 Hs.82042 

60 130365 RC.N24134 W56119 Hs.155103 

421642 RCJJ24195 AF172Q56 Hs.108346 

439311 RC.N26739 BE270668 Hs.1 51945 

124383 RC_N27Q98 N27098 Hs.102463 

124387 RC.N27637 N27637 Hs.109019 

65 129341 RC.N33090 AJ193519 Hs.226396 

129081 RC_N35957 A1364933 Hs, 168913 

. 1Q2827 RC.N38959J BE244588 Hs.6456 

124433 RC_N39069 AA280319 Hs288840 

124441 RCJM6441 AW450481 Hs.161333 

70 132338 RCJW827DJ AA353868 Hs.182982 

131403 RC_N48365_s AI473114 Hs26455 

124466 RC.N51316 R10084 Hs.113319 

132210 RC.N51499.S NM_007203 Hs.42322 

124483 RCJI53976 AI821780 Hs.179864 

75 124484 RC_N54157 H66118 H&285520 

124485 RCJJ54300 AB040933 Hs.15420 



epilepsy, progressive myoclonus type 2, Lafora disease (laforin) 

ESTs, Weakly similar to CA15_HUMAN COLLAGEN ALPHA 1(V) CHAIN PRECURSOR

ESTs

ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens]

gb:zv37d10.s1 Soares ovary tumor NbHOT Homo sapiens cDNA clone IMAGE:755827 3' similar

ESTs 

SEC10 (S. cerevisiae)-like 1

EST, Weakly similar to I38022 hypothetical protein [H.sapiens]

Homo sapiens mRNA; cDNA DKFZp564D036 (from clone DKFZp564O036)

KIAA0530 protein

KIAA0133 gene product

ESTs, Weakly similar to unnamed protein product [H.sapiens]
ESTs

ESTs, Highly similar to p60 katanin [H.sapiens]
Homo sapiens cDNA: FLJ21543 fis, clone COL06171

gbaeS2fl)1.s1 Stratagene lung carcinoma 937218 Homo sapiens cDNA clone IMAGE551481 3* 

RNA-binding protein gene with multiple splicing 

EST 

ESTs 

ESTs 

ESTs 

thyroid hormone receptor interactor 6 
hypothetical protein U37E16.5

Homo sapiens mRNA; cDNA DKFZp761C1712 (from clone DKFZp761C1712)
cullin 5

gb:HSC2JH052 normalized infant brain cDNA Homo sapiens cDNA clone c-2jh06 3', mRNA

bone morphogenetic protein 2

hypothetical protein DKFZp434D1428

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 1

gb:ym17a12.r1 Soares infant brain 1NIB Homo sapiens cDNA clone 3", mRNA sequence

EST 

ESTs 

leucyl-tRNA synthetase, mitochondrial
putative G protein-coupled receptor
ESTs
ESTs

gb:yu70c12.s1 Weizmann Olfactory Epithelium Homo sapiens cDNA clone IMAGE:239158 3'

ESTs, Weakly similar to ALUE_HUMAN !!!! ALU CLASS E WARNING ENTRY !!!! [H.sapiens]

Rhesus blood group, CcEe antigens

Homo sapiens clone FLB6914 PRO1821 mRNA, complete cds

HIR (histone cell cycle regulation defective, S. cerevisiae) homolog A

EST 

lysosomal 

presenilins associated rhomboid-like protein

v-ral simian leukemia viral oncogene homolog A (ras related)

neurotrophic tyrosine kinase, receptor, type 1

Homo sapiens cDNA: FLJ22224 fis, clone HRC01703

Homo sapiens cDNA: FLJ22425 fis, clone HRC08686

hypothetical protein DKFZp434K2435

mitochondrial ribosomal protein L12

putative transcription regulation nuclear protein; KIAA1689 protein

epsilon-tubulin

Sec23-interacting protein p125

solute carrier family 23 (nucleobase transporters), member 1

eukaryotic translation initiation factor 1A, Y chromosome

retinoic acid repressible protein

mitochondrial ribosomal protein L43 

EST 

ESTs 

hypothetical protein FLJ11126

serine/threonine kinase 24 (Ste20, yeast homolog)

chaperonin containing TCP1, subunit 2 (beta)

PRO1575 protein

ESTs

golgin-67

ESTs

kinesin heavy chain member 2
A kinase (PRKA) anchor protein 2
ESTs

ESTs, Weakly similar to 2109260A B cell growth factor [H.sapiens]
KIAA1500 protein
124494 RC_N54831

129200 RCJI53849 

124527 RC.N62132 

124532 RC.N62375 

133213 RCLN63138 

124539 RCJI63172 

133651 RC.N63772 

129196 RC.M63787 

124575 RC.N6816B 

124576 RC.N68201 

124577 RCJ468300 
mRNA 

124578 RCJ468321 
124593 RC.N69575 
128501 RCJT75007 
105691 RCJ175542 
128473 RC.N90066 
128639 RC.N91246 
124652 RC.N92751 
133137 RC_N93214_s 
124671 RC.N99148 
PROTEIN 

133054 RCR07B76 
[C.elegans] 

130410 RC.R10865J 

124720 RCJW056 
similar to 

124722 RCJH1488 

129961 RC_R22947 
repetitive element 128944 

132965 RC.R26589J 

133740 RCR37588J5 

133074 RC.R37613 

124757 RC_R38398 

124762 RC R39179J 

124773 RCJW0923 

135266 RC.R41179 

131375 RC_R41294js 

133753 RCJW2307J 

128540 RC.R43189J 

124785 RC_R43306 

124792 RC.R44357 

124793 RC.R44519 
sequence* 

124799 RCLR45088 
sequence. 

124812 RCJW7948J 

124821 RC_R51524 

127274 RCJ*54950 

124835 RCLR55241 

124845 RC.R59585 

124847 RC.R60044 

440630 RC.R60872 



N54831 Hs.271381 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

N59849 Hs.13565 Sam68-like phosphotyrosine protein, T-STAR

N79264 Hs.269104 ESTs

N62375 Hs.102731 EST

AA903424 Hs.6786 ESTs

D54120 Hs.146409 cell division cycle 42 (GTP-binding protein, 25kD)

AI301740 Hs.173381 dihydropyrimidinase-like 2

BE296313 Hs.265592 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens] 

N68168 gb:za11c01.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3', mRNA sequence

N68201 Hs.269124 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

N68300 gb:za12g07.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:292380 3',

N68321 Hs.231500 EST 

N69575 Hs.102788 ESTs 

AL133572 Hs.199009 protein containing CXXC domain 2 

AI680737 Hs.289068 Homo sapiens cDNA FLJ11918 fis, clone HEMBB1000272

T78277 Hs.100293 O-linked N-acetylglucosamine (GlcNAc) transferase (UDP-N-acetylglucosamine:polypeptide-

AW582962 Hs.102897 CGW7 protein

W19407 Hs.3862 regulator of nonsense transcripts 2; DKFZP434D222 protein

AB002316 Hs.65746 KIAA0318 protein

AK001357 Hs.102951 Homo sapiens cDNA FLJ10495 fis, clone NT2RP2000297, moderately similar to ZINC FINGER

AA464836 Hs.291079 ESTs, Weakly similar to T27173 hypothetical protein Y54G11A.9 - Caenorhabditis elegans

J00077 Hs.155421 alpha-fetoprotein

R05283 gb:ye91c08.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:125102 3'

Hs.185685 ESTs 



T97733 
R23053 
RC_R23930_sAL137586 



AI248173 

AW162919 

AL134275 

H11368 

AA553722 

R45154 

R41179 

AW293165 

NM.004427 

AW297929 

W38537 

R44357 

R44519 

R45088 

R47948 

H87832 

AW966158 

R55241 

R59585 

W07701 

BE561430 



Hs.191460 

Hs.170160 

Hs.6434 

Hs.141055 

Hs.92096 

Hs.106604 

Hs.97393 

Hs.143134 

Hs.165263 

Hs.328317 

Hs.280740 

Hs.48712 



Hs.1 88732 

Hs.7388 

Hs.58582 

Hs.101214 

H&101255 

Hs^04177 

Hs^39388 



gb:yh31a05.r1 Soares placenta Nb2HP Homo sapiens cDNA clone 5' similar to contains L1

Hs.52763 anaphase-promoting complex subunit 7

hypothetical protein MGC12936

RAB2, member RAS oncogene family-like

hypothetical protein DKFZp761F2014

Homo sapiens clone 23758 mRNA sequence

ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]

ESTs

KIAA0328 protein
ESTs

early development regulator 2 (homolog of polyhomeotic 2)
EST

hypothetical protein MGC3040
hypothetical protein FLJ20736

gb:yg24h04.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:33350 3', mRNA
gb:yg38g04.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGERS 3', mRNA
ESTs

kelch (Drosophila)-like 3

Homo sapiens cDNA FLJ12789 fis, clone NT2RP2001947

EST

ESTs

Homo sapiens clone FLB8503 PRO2286 mRNA, complete cds
Human DNA sequence from clone RP1-304B14 on chromosome 6. Contains a gene for a novel 



protein and a part of a gene for a novel protein with two isoforms. Contains ESTs, STSs, GSSs and a CpG island 



124861 RC.R66690 

130141 RC_R672S6_s 

124879 RC.R73588 

124892 RC_R79403 

124906 RO.R87647 

124922 RC_R93S22 

124940 RC_R99599_s 

124941 RC.R99812 
124943 RCJ02888 



R67567 
NrVL004455 
R73588 
AI970003 
H75964 
R93622 
AF068846 
AI765S61 
AW963279 



WARNING ENTRY [H.sapiens]



124947 RQJ03170 
124954 RC.T10465 
132924 RC_T15418_f 
133113 RC.T15597J 
132975 RCJT15652J 
133235 ROJ16898_s 
131082 RC.T26644J 
124980 RC.T40841 
124984 RCJT47566J 
124991 ROJ50116 



TD3170 
AW964237 
U55184 
8E383768 
R43504 
AW960782 
AT091121 
T40841 
BE313210 
T50116 



Hs.107110 
Hs.150956 
Hs.101533 
HSJ23756 
Hs.107815 
Hs.12163 
Hs.103804 
Hs.27774 
Hs.123373 

Hs.100165 

Hs.6728 

Hs.154145 

Hs.65238 

Hs.6181 

Hs.6856 

Hs^46218 

Hs38581 

Hs^23241 



ESTs 



ESTs 



(mulfipleHike 1 



to similar to SP:VE2^LAMBD P03756 EA22 GENE , mRNA sequence. 
129475 RCJ50145_s NM_004477 Hs^03772 FSHD region gene 1



hypothetical protein similar to swine acylneuraminate lyase
ESTs

eukaryotic translation initiation factor 2, subunit 2 (beta, 38kD)
heterogeneous nuclear ribonucleoprotein U (scaffold attachment factor A)
ESTs, Highly similar to AF161349 1 HSPC086 [H.sapiens]

ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION
ESTs

KIAA1548 protein
hypothetical protein FLJ11585

95 kDa retinoblastoma protein binding protein; KIAA0561 gene product
ESTs

ash2 (absent, small, or homeotic, Drosophila, homolog)-like
Homo sapiens cDNA: FLJ21781 fis, clone HEP00223
ESTs

eukaryotic translation elongation factor 1 delta (guanine nucleotide exchange protein)
gb:yb77c10.s1 Stratagene ovary (937217) Homo sapiens cDNA clone IMAGE:77202 3' similar
125000 RC_T58615
132932 RCJ59940J 
129534 RCJ63595 

125008 RCJ64891 

125009 RCJT54924 
132940 RCJ64933_r 

125017 RCJ68875 
sequence. 

125018 RCJ69027 
125020 RCJ69924 
129891 RCJ70353 
134204 RCJ79780_s 
125050 RCJT79951 
125052 RCJB0174_s 
125054 RCJ8Q622 
125063 RC.T85352 



T58615 

AW1 18826 

AK002126 

T91251 

T64924 

T79136 

T68875 



Hs.110840 

Hs.6093 

Hs.11260 

HsJ03046 
Hs.127243 



ESTs 

Homo sapiens cDNA: FLJ22783 fis, clone KAIA1993
hypothetical protein FLJ11264

gb:yd60a10.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3', mRNA sequence
ESTs

Homo sapiens mRNA for KIAA1724 protein, partial cds

gb:yc30f05.s1 Stratagene liver (937224) Homo sapiens cDNA clone IMAGE:82209 3', mRNA



T69027 Hs.57475 sex comb on midleg homolog 1

T69981 gb:yc19d03.r1 Stratagene lung (937210) Homo sapiens cDNA clone 5', mRNA sequence

AI084813 Hs.13197 ESTs
AI873257 Hs.7994 hypothetical protein FLJ20551
AW970209 Hs.111805 ESTs

T85104 Hs.222779 ESTs, Moderately similar to similar to NEDD4 [H.sapiens]
T80822 Hs.268601 ESTs, Weakly similar to envelope [H.sapiens]

T85352 gb:yd82d01.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:114721 3'

similar to contains Alu repetitive element;contains L1 repetitive element ;, mRNA sequence.

125064 RC_T85373 T85373 gb:yd828J7.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:114757 3'

similar to contains Alu repetitive element;contains MER3 repetitive element ;, mRNA sequence.



125056 RCJ86284 T86284 
Alu repetitive element;, mRNA sequence

112264 RCJ89579_S AL045364 Hs.79353 

125080 RCJ90360 T90360 Hs.268620 
WARNING ENTRY [H.sapiens]

125097 RCJ94328J AW576389 Hs.335774 
125104 RCJ95590 T95590 



gb:yd77b07.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3' similar to contains
transcription factor Dp-1

ESTs, Highly similar to ALU6_HUMAN ALU SUBFAMILY SP SEQUENCE CONTAMINATION



EST, Moderately similar to S65657 alpha-1C-adrenergic receptor splice form 2 [H.sapiens]
gb:ye40a03.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone 3' similar to
gb:M10817 IGURRAA Iguana iguana 5S (rRNA);, mRNA sequence

135107 RCJ97257J T97257 Hs.337531 ESTs, Moderately similar to I38022 hypothetical protein [H.sapiens]
129550 RCJT97599J AA845462 Hs.124024 deltex (Drosophila) homolog 1

125118 RC_T97620 R10606 gb:yf35f11.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:128877 3'

similar to contains Alu repetitive element, mRNA sequence.



125120 RCJ97775 T97775 

134160 RC.T98152 T98152 

125136. RCJW31479 AW952364 

125144 RCJW37999 AB037742 

125150 RCJW38240 W38240 

104180 RCJV40150 AA247778 

131987 RC.W45435 AW453069 

125178 RC.W58202 W93127 

125180 RCJV58344 W58469 

125182 RC.W58650 AA451755 

130588 RC.W68736 AL030996 

125197 RC_W69106 AF086270 

133497 RC.W69111 BE617303 

100562 RC_W69385_s NNL005185 

125639 RC_W69399_s Z97630 

129232 RCW69459 R98881 

101495 RC.W72424 W72424 

125209 RC W72724 W72724 

125212 RC W72834 AA746225 

129132 RC W73955 BE383436 

125223 RC W74701 AI916269 
WARNING ENTRY [H.sapiens]

125225 RC W76540 W74169 

125228 RC.W79397 AA033982 

132393 RC W85888 AL135094 

125238 RC.W86038 N99713 

125247 RCJV86881 AA694191 

129296 RC.W87804 AI051967 

125263 RCJV88942 AA098878 

125266 RC_W90022 W90Q22 
PRECURSOR [H.sapiens]

131321 RCW92272 U91543 

131601 RCW92764_s NNL007115 

131677 RC.W93040 H05317 

120837 RCJW93Q92 BE149656 

125277 RC.W93227 W93227 

125278 RC W93523 AI218439 
125280 RC W93659 AI123705 
131856 RC W94003_s W93949 
131844 RC_W94401_S AI419294 
125284 RCJV94688 NMJM2666 
313447 RC_W94787_s AW016321 
130799 RCJ38294_s AB028945 
125289 RC.^38311 T34530 
128874 RC_Z38465_s H06245 



Hs.100717 
Hs.79432 
Hs.129051 
Hs.24336 

Hs.1 19155 

Hs.3657 

Hs31845 

Hs.103120 

Hs.263560 

Hs.16411 

Hs.278554 

Hs.74266 

Hs.301512 

Hs^26117 

Hs.109655 

Hs.1 12405 

Hs.1 03174 

Hs.103173 

Hs.108847 

Hs.109057 

Hs.16492 
Hs.1 10059 
Hs.47334 
Hs.109514 
Hs.163914 
Hs.1 10122 

Hs.186809 

Hs.25601 

Hs.29352 

Hs.283549 

Hs306621 

Hs.103245 

Hs.129998 

Hs.106932 

Hs33245 

Hs.324342 

Hs.103253 

Hs.82306 

Hs.12696 

rte.4210 

Hs.1 06801 



EST 

fibrillin 2 (congenital contractural arachnodactyly)
ESTs

KIAA1321 protein

Empirically selected from AFFX single probeset

Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 614975

activity-dependent neuroprotective protein

ESTs 

ESTs 

ESTs 

hypothetical protein LOC57187 



hypothetical protein MGC4251 

nuclear mitotic apparatus protein 1 

H1 histone family, member 0

sex comb on midleg (Drosophila)-like 1

S100 calcium-binding protein A9 (calgranulin B)

ESTs, Weakly similar to TSP2_HUMAN THROMBOSPONDIN 2 PRECURSOR [H.sapiens]
ESTs

hypothetical protein MGC2749

ESTs, Weakly similar to ALU5_HUMAN ALU SUBFAMILY SC SEQUENCE CONTAMINATION
DKFZP564G2022 protein

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

hypothetical protein FLJ14495

ESTs 

ESTs 

ESTs 

gb:zn45g10.r1 Stratagene HeLa cell s3 937216 Homo sapiens cDNA clone 5', mRNA sequence
ESTs, Highly similar to LCT2_HUMAN LEUKOCYTE CELL-DERIVED CHEMOTAXIN 2

chromodomain helicase DNA binding protein 3
tumor necrosis factor, alpha-induced protein 6
ESTs

Homo sapiens cDNA FLJ11963 fis, clone HEMBB1001051
EST

enhancer of polycomb 1
ESTs

ESTs

ESTs

perilipin

destrin (actin depolymerizing factor)

cortactin SH3 domain-binding protein

Homo sapiens cDNA FLJ13069 fis, clone NT2RP3001752

ESTs, Weakly similar to PC4259 ferritin associated protein [H.sapiens] 
130966 RC_Z38525_s

128875 RC Z38538J 

133200 RC_Z38551_s 

130158 RCJ38783_s 

5 125295 RC.Z39113 
domain, (semaphorin) 4F 

125298 RCLZ39255J 

125300 RCLZ39591 

323122 RC_Z39783_s 

10 311463 RCJ39920 

130882 RCLZ40166J 

128888 RC_ J Z40388_s 

125310 RC_Z4G646 

125315 RC.Z41697 

15 125317 RC_Z99349 

135096 RC_Z99394_S 

104786 RC.AA027168 

132837 O58024.S 

120456 RCLAA251113 

20 132459 RC.M347573 

101545 M31210 

133505 C01527 

132360 RCJJ62948_s 

132738 RC W42674 

25 119586 RC_W43000_s 

129914 RC_N31750_s 

130839 AF009301 

132813 L37347 

134342 M99564 

30 131878 RCJ\A43Q673 

105426 RCJW251297 

132968 RCJW620722 

132173 RCJV46577_s 

113932 RCLW81237 

35 114452 RCJ\A020825 
PROTEIN TC10 

115243 RCJ\A278765 

134403 RC_H93708_s 

129647 RC.N49394 

40 111428 RC_H56559__s 

115967 RCLAA446887 

120726 RC.AA293656 

114995 RCUA251152 

303876 RCJW233334.S 

dominant, ataxin 3)

311463 RCLZ39920 

120302 RCJW192173 

133071 RC_AA455044 

121032 RC.AA398504 

50 129829 U41813 

120245 RC_AA166965 

120985 RCJW398222 

114184 RCJZ39095 

447503 RC_AA2B4744J 

cds

132837 RC_AA428201 

121034 RCLAA398507 

119718 RC.W69216 

120455 RCLAA251083 

60 125280 RC.W93659 

132155 RC.AA227903 

120609 RCJW283902 

121278 RCJWW1631 

109023 RCJVA157293 

65 129815 RCJ360208J 

108061 RCJW043979 

113287 RCJ66847 

114082 RCLZ38239 

116334 RCJW491457 

70 131486 RC_Z4007Ls 

107860 RCLAAQ24961 

131263 RCJM43826 

132207 RCJW443294 

129183 RC.M155743 

75 408431 RC.T23708 

120575 RCJW280934 



AW971018 Hs.21659 ESTs 

AB040923 Hs.106808 kelch (Drosophila)-like 1

AB037715 Hs.183639 hypothetical protein FLJ10210

AB032947 Hs.151301 Ca2+-dependent activator protein for secretion

AB022317 Hs.25887 sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic

AW972542 Hs.289008 Homo sapiens cDNA: FLJ21814 fis, clone HEP01068

Z39591 Hs.101376 EST 

BE622770 HsJ264915 Homo sapiens cDNA FU12K)8 fis. done NT2RP2004399 

R55344 Hs.22142 cytochrome b5 reductase b5R2 

AA497044 Hs.20887 hypothetical protein FLJ10392

AI760853 Hs.241558 ariadne (Drosophila) homolog 2

R59161 Hs.124953 ESTs

R38110 Hs.10686 ESTs

Z99348 Hs.112461 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

AA081258 Hs.132390 zinc finger protein 36 (KOX 18)

AA027167 Hs.10031 KIAA0955 protein

AA370362 Hs.57958 EGF-TM7-latrophilin-related protein

AA488750 Hs.88414 BTB and CNC homology 1, basic leucine zipper transcription factor 2

AL120071 Hs.48998 fibronectin leucine rich transmembrane protein 2

BE246154 Hs.154210 endothelial differentiation, sphingolipid G-protein-coupled receptor, 1

AIS30124 Hs.324504 Homo sapiens mRNA; cDNA DKFZp586J0720 (from clone DKFZp586J0720)

AW893660 Hs.46440 solute carrier family 21 (organic anion transporter), member 3

AK000738 Hs.264636 hypothetical protein FLJ20731

AF088033 Hs.159225 ESTs

NM_012421 Hs.13321 rearranged L-myc fusion sequence

AB011169 Hs.20141 similar to S. cerevisiae SSM4

BE313625 Hs.57435 solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2

NM_000275 Hs.82027 oculocutaneous albinism II (pink-eye dilution (murine) homolog)

AA083764 Hs.6101 hypothetical protein MGC3178

W20027 Hs.23439 ESTs

AF234532 Hs.61638 myosin X

X89426 Hs.41716 endothelial cell-specific molecule 1

AA256444 Hs.126485 hypothetical protein FLJ12604; KIAA1692 protein

AI369275 Hs.243010 Homo sapiens cDNA FLJ14445 fis, clone HEMBB1001294, highly similar to GTP-BINDING

AA806600 Hs.116665 KIAA1842 protein

AA334551 Hs.82767 sperm specific antigen 2

AB018259 Hs.118140 KIAA0716 gene product

AL031428 Hs.174174 KIAA0601 protein

AI745379 Hs.42911 ESTs 

AA293655 Hs.97293 ESTs 

AA769266 Hs.193657 ESTs 

U64820 Hs.66521 Machado-Joseph disease (spinocerebellar ataxia 3, olivopontocerebellar ataxia 3, autosomal

R55344 Hs.22142 cytochrome b5 reductase b5R.2

AA837098 Hs.269933 ESTs

BE384932 Hs.64313 ESTs, Weakly similar to AF257182 1 G-protein-coupled receptor 48 [H.sapiens]

AA393037 Hs.161798 ESTs

AF010258 Hs.127428 homeo box A9

AW959615 Hs.111045 ESTs 

AI219B96 Hs.97592 ESTs 

R56434 Hs.21062 ESTs 

AA115496 Hs.336898 Homo sapiens, Similar to RIKEN cDNA 1810038N03 gene, clone MGCS890, mRNA, complete

AA370362 Hs.57958 EGF-TM7-latrophilin-related protein

AL389951 Hs.271623 nucleoporin 50kD

W69216 Hs.92848 ESTs

AA251720 Hs.104347 ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!!! [H.sapiens]

AI123705 Hs.106932 ESTs

AK001607 Hs.41127 hypothetical protein FLJ13220

AW978721 Hs.266076 ESTs, Weakly similar to A46010 X-linked retinopathy protein [H.sapiens]

AA037121 Hs.98518 Homo sapiens cDNA FLJ11490 fis, clone HEMBA1001918

AA157293 Hs.72168 ESTs

BE565817 Hs.26498 hypothetical protein FLJ21657

AA043979 Hs.62651 EST

T66847 Hs.194040 ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]

AK001612 Hs.26962 Homo sapiens cDNA FLJ10750 fis, clone NT2RP3001929

AL038450 Hs.48948 ESTs

F06972 Hs.27372 BMX non-receptor tyrosine kinase

AA024961 Hs.50730 ESTs

AU077002 Hs.24950 regulator of G-protein signalling 5

BE206939 Hs.42287 E2F transcription factor 6

BE561824 Hs^73369 uncharacterized hematopoietic stem/progenitor cells protein MDS027

AI338631 Hs.43268 Homo sapiens cDNA: FLJ22536 fis, clone HRC13155

AW978022 Hs.238911 hypothetical protein DKFZp762E1511; KIAA1816 protein
132121
117657 
134922 
118523 
5 116845 
115291 
120326 
130174 
129131 
10 129868 
118661 
1298^ 
115985 
'134637 
132714 
129771 
123360 
132902 
113716 
113825 
130367 
120541 
116727 
118219 
119767 
128917 
451553 
132716 
118525 
114616 
119743 
108154 
122798 
133746 
119822 
122186 
114941 
118053 
123234 
129280 
118995 
116750 
129026 
105127 
114513 
411856 
132036 
130091 
RC_AA443284_s

RC.N39074 

RC_WD4507_s 

RC_R4182B_s 

RCH64973 

RCLAA279943 

RCJ\A19S300 

M29550 

RCJW436489 

RC.AA287032 

RCLN70777 

RCJU496921 

RC_AA447709 

RCJ^A369856_s 

RC J AA252598 

RCLH73237 

RCJ^A504784 

RC.AA490969 

RCJ97750 

RC.W48860 

RC.Z38501 

RC.M278298 

RC_F13S84 

RCJI62231 

RCJV72562 

RCAA481252 

RC_AA020928 

RCJ\A251288 

RC.N67861 

RCj*A084162 

RCJV70242 

RC_AA425151_s 

RC_AA460324 

U44378 

RCJV74471 

RCLAM35842 

RCJ\A243017 

RC.N53367 

RC.AA490227 

M63154 

RC.N94591 

RC_H05960 

M98833 

RC_AA158132 

RC_AA044825 

RCJT35697 

W01568 

RC.W88999 



NM_004529 

N39074 

AI718295 

Y07759 

AA649530 

BE545072 

AA196300 

M29551 

AB026436 

AW172431 

AL137554 

AF010258 

AA447709 

U87309 

W39388 

AL096748 

AA532718 

AI936442 

AA001356 

AW014486 

AL135301 

W07318 

R76472 

AA862391 

W72562 

AI365215 

AA018454 

BE379595 

N67861 

AW979261 

AA947552 

NM.005754 

AW366286 

AW410035 

AF086409 

AA398811 

AA236512 

N53391 

NM.001938 

M63154 

N94591 

AA760689 

AL120297 

AA045648 

AA044873 

H67899 

AL157433 

W88999 



Hs.404 ' 
Hs.44933 
Hs.91161 
Hs.170157 

Hs.122579 

Ks.21145 

Hs.151531 

Hs.177534 

Hs.13012 

Hs.49927 

Hs.127428 

Hs.268115 

Hs.180941 

Hs.55336 

Hs.102708 

Hs.178604 

Hs^9838 

Hs.18159 

Hs.22509 

Hs.8768 

Hs.240 

Hs.65646 

Hs.48494 

Hs£8119 

HSJ206097 

Hs.269211 

Hs.283738 

Hs.49390 

Hs.291993 

Hs.58086 

HS220689 

Hs.145696 

Hs.75862 

Hs.301327 

Hs.104673 

Hs.87331 

Hs.47629 

Hs,16697 

Hs.110014 

Hs.323056 

Hs.92418 

Hs.108043 

Hs.301957 

Hs.103446 

Hs.4190 

Hs.37706 



sequence 

50 414108 U09564 AI267592 

119881 RC_W81456 W81486 

117770 RCJ447953 AW957372 

119850 RC.W80447 A1247568 

115439 RCJVA284561 AI567972 

55 123107 RCUA486071 AA225048 

406698 M24364 X03068 

121231 RCJW100780 AA814948 

132074 AB002366 AA478486 

413670 AB000115 AB000115 

60 125277 RC_W93227 W93227 

114056 RCJ^A186324 AA188175 

121153 RC.M399640 AA399640 

121609 RCJK416867 AA416867 

120661 RC.M287556 AA287556 

65 120850 RC_AA349647 AA349647 

124947 RC.T03170 T03170 

130529 RC_AA280886 AA178953 
repetitive element, mRNA sequence

117683 RCJJ4O180 N40180 

IMAGE:276387 3' similar to contains L1.t1 L1

12)745 RC^AA302809 AA302809 

120936 RC JU385934 AA385934 

112597 RC.R78376 R78376 

12)183 RC^40174 AW082866 

75 12)844 RCJU287038 AI869129 



Hs.75761 
Hs.58646 
Hs.46791 
Hs.58452 
Hs.193090 
Hs.104207 
HS73931 
Hs.96343 
Hs.3852 
Hs.75470 
Hs.103245 
Hs.82506 
Hs.97694 
Hs.98185 
Hs.263412 
Hs.96927 
Hs.100165 



myeloid/lymphoid or mixed-lineage leukemia (trithorax (Drosophila) homolog); translocated to, 3

ESTs

prefoldin 4

myosin VA (heavy polypeptide 12, myoxin)

gb:ns44f05.s1 NCI_CGAPjW1 Homo sapiens cDNA clone, mRNA sequence
hypothetical protein FLJ10461
hypothetical protein RG083M05.2

protein phosphatase 3 (formerly 2B), catalytic subunit, beta isoform (calcineurin A beta)

dual specificity phosphatase 10

ESTs

protein kinase NYD-SP15
homeo box A9

ESTs, Weakly similar to T08599 probable transcription factor CA150 [H.sapiens]

vacuolar protein sorting 41 (yeast homolog)

Homo sapiens, clone MGC:17421, mRNA, complete cds

DKFZP434A043 protein

ESTs

hypothetical protein FLJ10808

ESTs

ESTs

hypothetical protein FLJ10849
1 



Hs.97184 
Hs29733 
Hs.65882 
Hs.96616 



ESTs 

ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]
ESTs 

oncogene TC21 
ESTs 

casein kinase 1, alpha 1

ESTs 

ESTs 

ESTs 

Ras-GTPase-activating protein SH3-domain-binding protein
splicing factor (CC1.3)

MAD (mothers against decapentaplegic, Drosophila) homolog 4

ESTs 

ESTs 

ESTs 

ESTs 

down-regulator of transcription 1, TBP-binding (negative cofactor 2)

gastric intrinsic factor (vitamin B synthesis) 

ESTs 

ESTs 

Friend leukemia virus integration 1 

nudix (nucleoside diphosphate linked moiety X)-type motif 5 

ESTs 

Homo sapiens cDNA: FLJ23269 fis, clone COL09533
hypothetical protein DKFZp434E2220

gb:zh70h03.s1 Soares_fetal_liver_spleen_1NFLS_S1 Homo sapiens cDNA clone 3', mRNA

SFRS protein kinase 1
ESTs

ESTs, Weakly similar to I38022 hypothetical protein [H.sapiens]
ESTs

ESTs, Highly similar to AF161437 1 HSPC319 [H.sapiens]
ESTs

major histocompatibility complex, class II, DQ beta 1

ESTs, Weakly similar to ALUC_HUMAN !!!! ALU CLASS C WARNING ENTRY !!!! [H.sapiens]
KIAA0368 protein 

hypothetical protein, expressed in osteoblast 
EST 

KIAA1254 protein

ESTs

EST

ESTs, Weakly similar to ALUB_HUMAN !!!! ALU CLASS B WARNING ENTRY !!!! [H.sapiens]

Homo sapiens cDNA FLJ12573 fis, clone NT2RM4000979

ESTs

gb:zp39e03.s1 Stratagene muscle 937209 Homo sapiens cDNA clone 3' similar to contains Alu

gb:yy44d02.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone
element ;, mRNA sequence.
gb:EST10426 Adipose tissue, white I Homo sapiens cDNA 3' end, mRNA sequence,
EST, Highly similar to (defline not available 7499603) [C.elegans]
EST 
ESTs 
ESTs 
119023 RC_N98488 N98488
IMAGE:310129 3', mRNA sequence.

107582 RCJVA002147 AA002147 

118249 RCJI62580 N62580 

115022 RCLAA252029 AA252Q29 

117710 RCJI45198 N45198 

115341 RC.AA281452 AA281452 

118896 RC.N9QS80 N46213 

121121 RC.M399371 AA399371 

118329 RCN63520 N6352Q 
3*. mRNA sequence. 

119496 RC.W35416 W35416 

118111 RCJJ55493 N55493 
mRNA sequence. 

119062 RC.R16698 AW444881 

116710 RC.F10577J F10577 

119261 RC.T15S56 T15956 

122723 RC..AA457380 AA457380 

similar to contains Ll.b3 L1 repetitive element 

117732 RCJ146452 N46452 



gb:zb82h01.s1 Soares_senescent_fibroblasts_NbHSF Homo sapiens cDNA clone
Hs^9952 EST

Hs.322925 EST, Weakly similar to putative p150 [H.sapiens]
Hs.87935 ESTs

Hs.47248 ESTs, Highly similar to similar to Cdc14B1 phosphatase [H.sapiens]
Hs.88840 EST, Weakly similar to granule cell marker protein [M.musculus]
Hs.54642 methionine adenosyltransferase II, beta
Hs.189095 similar to SALL1 (sal (Drosophila)-like

gb:yy62f01.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone IMAGE:278137

Hs.156861 ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]

gb:yv50c02.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:246146 3',

ESTs 



Hs.77829 
Hs.306088 
Hs.65289 



v-crk avian sarcoma virus CT10 oncogene homolog
EST

gb:aa86b10.s1 Stratagene fetal retina 937202 Homo sapiens cDNA clone IMAGE:838171 3'
mRNA sequence.

gb:yy76h09.s1 Soares_multiple_sclerosis_2NbHMSP Homo sapiens cDNA clone



IMAGE:279521 3' similar to contains L1.t2 L1 repetitive element, mRNA sequence.
104787 RC_AA027317 AA027317 gb:ze97d11.s1 Soares_fetal_heart_NbHH19W Homo sapiens cDNA clone IMAGE:366933 3'

similar to contains Alu repetitive element;, mRNA sequence.



100071 A28102 
115819 RC_AA426573 
130882 RC.Z40166J 
125225 RCJV76540 
108339 RCJW70801 



A28102 
AA486620 
AA497044 
W74169 
AW151340 



WARNING ENTRY [H.sapiens]



100338 D63483 
121638 RCJW417027 
103875 RCJW18387 
118716 RC_N73460 
119763 RCJV72450 
121917 RCJU428218 
132806 M91488 
130949 Y10659 
108806 RCJW29933 
133276 RCJUW90478 
134760 RC H16758 
132867 AA121287 
132051 AA091284 
1142)8 RC^Z39301 
104094 AA418187 
128718 AA426361 
302032 RC_N20407 
115501 RCJ\A291553 
101997 U01160 
103708 AA037206 
101899 S59184 
115839 RCJW429038 
409459 D50678 
103563 Z22534 
123233 RC.AA490225 



086864 
AA379203 
T26379 
AI658908 
R54146 
AA406397 
AI699432 
AV656840 
AF070578 
AW978439 
NM_000121 
AF226667 
AA393968 
AL049466 
AA418187 
NNL002959 
NM_001992 
AA291553 
AU076536 
AA430591 
S59184 
BE300266 
D86407 
L02911 
AW974175 

AA402468 
AA159181 
H94227 



[H.sapiens]

121305 RC.AA40246B 

114798 RC.AA159181 

133145 RCJW96549 

131567 RC^AA291015.s AF015592 

60 112300 RC.R54554 H24334 

13507 RC.AA192099 AJ236885 

121033 RCJW398505 AA398505 

121151 RCJK399636 AA399636 

121402 RCJW406063 AA406063 

65 123203 RC.AA489571 AA352335 

132271 RCJU236466 AB030034 

125197 RCJV691Q6 AF086270 

114935 RCAA242B09 H23329 
WARNING ENTRY [H.sapiens]

70 125279 RCJ/V93640 AW401809 

108778 RC _AA128548 AF133123 

108087 RC _M045709 AA045703 

132466 RC.N66810 s AI597655 

133328 R36553 AW452738 

75 124057 RC.F13604 AA902384 

124800 RCJW5115 AW854086 



Human GABAa receptor alpha-3 subunit 

Hs.41135 endomucin-2

Hs.20887 hypothetical protein FLJ10392

Hs.16492 DKFZP564G2022 protein

Hs.51615 ESTs, Weakly similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE CONTAMINATION

Hs.57735 acetyl LDL receptor; SREC

Hs.306654 Homo sapiens cDNA FLJ13574 fis, clone PLACE1008625

Hs.48802 Homo sapiens clone 23632 mRNA sequence

Hs.118722 fucosyltransferase 8 (alpha (1,6) fucosyltransferase)

Hs.10450 Homo sapiens cDNA: FLJ22063 fis, clone HEP10326

Hs.98038 ESTs

Hs.278619 hypothetical protein FLJ10099

Hs.285115 interleukin 13 receptor, alpha 1

Hs.71168 Homo sapiens clone 24674 mRNA sequence 

Hs.69504 ESTs 

Hs.89548 erythropoietin receptor 

Hs.58553 CTP synthase II 

Hs.180145 HSPC030 protein 

Hs.7859 ESTs 

Hs.330515 ESTs 

Hs.281708 sorfiTml 

Hs.128087 coagulation factor II (thrombin) receptor

Hs.190086 ESTs

Hs.50984 sarcoma amplified sequence

Hs.72071 hypothetical protein FU2(X538

Hs.79350 RYK receptor-like tyrosine kinase

Hs.28935 transducin-like enhancer of split 1, homolog of Drosophila E(sp1)

Hs.54481 low density lipoprotein receptor-related protein 8, apolipoprotein e receptor

Hs.150402 Activin A receptor, type I (ACVR1) (ALK-2)

Hs.188751 ESTs, Weakly similar to MAPB_HUMAN MICROTUBULE-ASSOCIATED PROTEIN 1B

Hs.291557 ESTs

Hs.54900 serologically defined colon cancer antigen 1

Hs.6592 Homo sapiens, clone IMAGE:2961368, mRNA, partial cds

Hs.28853 CDC7 (cell division cycle 7, S. cerevisiae, homolog)-like 1

Hs.26125 ESTs 

Hs.112180 zinc finger protein 148 (pHZ-52) 

Hs.97360 ESTs 

Hs.143629 ESTs 

Hs.98003 ESTs 

Hs.65641 hypothetical protein FLJ20073

Hs.115175 sterile-alpha motif and leucine zipper containing kinase AZK

Hs.278554 heterochromatin-like protein 1

Hs.290880 ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION

Hs.4779 KIAA1150 protein

Hs.90847 general transcription factor IIIC, polypeptide 3 (102kD)

Hs.40545 ESTs 

Hs.49265 ESTs 

Hs.265327 hypothetical protein DKFZp761l141

Hs.73853 bone morphogenetic protein 2

Hs.138617 thyroid hormone receptor interactor 12
121029 RC_AA398482 AA398482 Hs.37641

120663 RC_AA287627 AAB27798 Hs.105089

102133 U15173 AU076845 Hs.155596

108246 RC_AA062855 AW23132 Hs.146343

125226 RC_W78134 AA782538 Hs.122647

120260 RC_AA171739 AK000061 Hs.101590

124906 RC_R87647 H75954 Hs.107815

109406 RC_AA226877 AA199883 Hs.67624

109271 RC_W95668 AW137422 Hs.86022

125052 RC_T80174_s TB5104 Hs.222779

109101 RC_AA167708 AW608930 Hs.52184

115241 RC_AA278723 AA648278 Hs.193859

117163 RCJ497909 N38861 Hs.42344

113530 RC_T90313 T90313 Hs.16732

120375 RC_AA227260 AF028706 Hs.111227

129435 AA314256 AF151852 Hs.111449

114864 RC_AA235256 AA135332 Hs.71608

103988 AA314389 AA314389 Hs.42500

131006 RC_AA242763 AF064104 Hs.22116

106781 RC_AA478474 AA330310 Hs.24181

106141 RC_AA424558 AF031463 Hs.9302

116213 RC_AA476738 AA292105 Hs.326740

135266 AB002326 R41179 Hs.97393

135058 RC_AA430152 AI379720 Hs.93814

119908 RC_W85844 AA524470 Hs.58753

103695 AA018758 AW207152 Hs.186600

103978 AA307443 NM_016940 Hs.34136

109485 RC_AA233472 BE619092 Hs.28465

129574 AA458603 AA026815 Hs.11463

115347 RC_AA281528 AA356792 Hs.334824

120765 RC_AA338735 AW961026 Hs.96752
WARNING ENTRY |H.sapiens] 

121059 RCJ\A398628 AA393283 

131887 AA046548 • W17064 Hs.332848 

35 member 1 

112064 RC_R43812 AL04939O Hs.22689 

115606 RCJW400465 AI025829 Hs.86320 

131750 RC H94855_s NH.004349 Hs.31551 

102123 U1451B NM.001809 Hs.1594 

40 129847 RC W46767 N64025 Hs.296178 

133809 RC AA235275 AV649326 Hs.76359 

132210 RC_N51499_s NM.007203 Hs.42322 

122356 RC.AA443794 AA443794 Hs.98390 

114958 RC J\A243708 N20912 Hs.42369 

45 103951 AA287840 AL353944 Hs.50115 

134703 RCJVA280704 AF117065 Hs.88764 
128727 AA287864 A1223335 Hs.50651 
105743 RC_AA293300_s BE246502 Hs.9598 
domain, (semaphorin) 4B 

50 103744 AA076003 AA079267 
sequence 

114348 N80402 AL050321 Hs.301532 

114009 RC_W900&7 AI248544 Hs.103000 

134704 RC.AA280849 AA837124 Hs.88780 
55 128629 AA399187 AUQ96748 Hs.102708 

104410 H65925 AI807519 Hs.104520 

110200 RC.H21075 H21075 Hs.31802 

124483 RC.N53976 AI821780 Hs.179864 

101391 M14648 NM.002210 Hs.295726 

60 109657 RC.F04826 R60900 Hs.26814 

117140 RC.H96813 H96813 Hs.42241 

132937 RC^AA233706J AW952912 Hs300383 

129799 R36410 AW967473 Hs.239114 

105077 RC..AA142919 W55946 Hs^34B63 

65 100850 RC.N5856LS AA836472 Hs.297939 

131043 RC.AA490925 AF084535 Hs22464 

118417 RC_N66048J AF080229 

129254 RC_jAA243695 AA252468 Hs.1098 

119149 RC_R58910 BE304701 Hs.65732 

70 133996 AA091367 AA380267 Hs.78277 

110223 RC.H23747 H19836 H&31697 

117626 RCJ436090 AKD01757 Hs.281348 

135286 RC J\A424469_s AW023482 Hs.97849 

122967 RCJU478521 AA806187 Hs.289101 

75 131236 AA282640 AF043117 Hs.24594 

128568 AA463380 H 1291 2 Hs.274691 



EST 
ESTs 

BCL2/adenovirus E1B 19kD-interacting protein 2
ESTs

N-myristoyltransferase 2

hypothetical protein

ESTs 

ESTs 

ESTs 

ESTs, Moderately similar to similar to NEDD-4 [H.sapiens]

hypothetical protein FLJ20618

ESTs 

ESTs 

ESTs 

Zic family member 3 (odd-paired Drosophila homolog, heterotaxy 1)

CGI-94 protein

ESTs

ADP-ribosylation factor-like 5

CDC14 (cell division cycle 14, S. cerevisiae) homolog B

ESTs

phosducin-like

hypothetical protein MGC10947

KIAA0328 protein

hypothetical protein 

ESTs 

ESTs 

chromosome 21 open reading frame 6 

Homo sapiens cDNA: FLJ21869 fis, clone HEP02442

UMP-CMP kinase

hypothetical protein FLJ14825

ESTs, Weakly similar to ALU8_HUMAN ALU SUBFAMILY SX SEQUENCE CONTAMINATION

gb:zf74e03.r1 Soares_testis_NHT Homo sapiens cDNA clone 5', mRNA sequence
SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily e,

Homo sapiens mRNA; cDNA DKFZp58601318 (from clone DKFZp58601318)
ESTs

core-binding factor, runt domain, alpha subunit 2; translocated to, 1; cyclin D-related
centromere protein A (17kD)
hypothetical protein FLJ22637
catalase 

A kinase (PRKA) anchor protein 2 

ESTs 

ESTs 

Homo sapiens mRNA; cDNA DKFZp761J1112 (from clone DKFZp761J1112)
male-specific lethal-3 (Drosophila)-like 1
Janus kinase 1 (a protein tyrosine kinase)

sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic

gb:zm97e10.s1 Stratagene colon HT29 (937221) Homo sapiens cDNA clone 3', mRNA

CRP2 binding protein
KIAA0831 protein
ESTs

DKFZP434A043 protein

Homo sapiens cDNA FLJ13694 fis, clone PLACE2000115

ESTs, Highly similar to A59266 unconventional myosin-15 [H.sapiens]

ESTs 

Integrin, alpha V (vitronectin receptor, alpha polypeptide, antigen CD51) 

ESTs 

ESTs 

hypothetical protein MGC3032 

mannosidase, alpha, class 1A, member 2

Homo sapiens cDNA FLJ12082 fis, clone HEMBB1002492

cathepsin B

epilepsy, progressive myoclonus type 2, Lafora disease (laforin)

gb:Human endogenous retrovirus K clone 10.1 polymerase mRNA, partial cds

DKFZp434J1813 protein 

ESTs 

DKFZP434F2021 protein 
ESTs 

hypothetical protein FLJ10895
ESTs

glucose regulated protein, 58kD

ubiquitination factor E4B (homologous to yeast UFD2)

adenylate kinase 3



12888 RC T03872 

15192 RCJU261920 

18688 RC.N71484 

22284 RCJM36837 

28981 AA135452 

31042 RC.R42457 

03704 AA028171 

21341 AA233107 

08593 RC AA455826 

15195 RCLAA262156 

15425 RCJW284071 

17258 RC N21299 

20209 RC.Z40892 

sequence 

34082 L16991 

04774 RC AA026055 

15625 RC_AA401630 

04469 N28707 

07401 W20054 

11686 RC R21510 

15300 RCJ\A280026 

15378 RC - AA282292 

32224 RC H97819 

13791 M95767 

29144 AA004987 

04448 L44574 

32084 RC T2698LS 

11831 RC_R36083 

4765 RCJW252163 

5029 RC.AA252219 

00457 H81492 

04536 R24011 

PROTEIN 91 

16167 RC_jAA461562 

03889 AA236771 

31978 RCJW8459js 

18843 RCJ1B0181 

20837 RC W93092 

33647 D21852 

29521 U41815 

I03746 AA081876 

sequence 

132019 RCJW134965J 

32310 RCJ\A284107 

17367 RC_N24954 

03743 AA075998 



AW195317 Hs.107716 hypothetical protein FLJ22344

AA741024 Hs.88378 ESTs

AK000708 Hs.169764 hypothetical protein FLJ20701

AA436837 gb:zv57g07.s1 Soares_testis_NHT Homo sapiens cDNA clone 3', mRNA sequence

AA927177 Hs.86041 CGG triplet repeat binding protein 1

AI826288 Hs.171637 hypothetical protein MGC2628

AA028171 Hs.151258 hypothetical protein FLJ21062

AF035528 Hs.153863 MAD (mothers against decapentaplegic, Drosophila) homolog 6

AW296451 Hs.24605 ESTs 

AW968619 Hs.155849 ESTs 

AA811895 Hs.180660 ESTs, Weakly similar to I54374 gene NF2 protein [H.sapiens]

AF086041 Hs.42975 ESTs

F02951 gb:HSC1HB082 normalized infant brain cDNA Homo sapiens cDNA clone c-1hb08 3', mRNA

L16991 Hs.79006 deoxythymidylate kinase (thymidylate kinase)

AW959755 Hs.288895 Homo sapiens cDNA FLJ12977 fis, clone NT2RP2006261

AA059459 Hs.62592 ESTs 

N28707 Hs.154304 Homo sapiens chromosome 19, BAC 282485 (CFT-B344H19) 

N91453 Hs.102987 ESTs 

R22039 Hs.23217 ESTs

AA280095 Hs.88689 ESTs

AA282292 Hs.279841 hypothetical protein FLJ10335

N41549 Hs.285410 ESTs

AJ269096 Hs.135578 chitobiase, di-N-acetyl-

AL137275 Hsi0137 hypothetical protein DKFZp434P0116

NM_007331 Hs.110457 Wolf-Hirschhorn syndrome candidate 1

NM_002267 Hs.3886 karyopherin alpha 3 (importin alpha 4)

R36095 Hs.268695 ESTs

AA463550 Hs.337532 ESTs, Weakly similar to A47582 B-cell growth factor precursor [H.sapiens]

AL137939 Hs.40096 ESTs 

BE246400 Hs.285176 acetyl-Coenzyme A transporter 

R24024 Hs.158101 Homo sapiens cDNA FLJ14673 fis, clone NT2RP2003714, moderately similar to ZINC FINGER

AI091731 Hs.87293 hypothetical protein FLJ20045

R85350 Hs.101368 ESTs

AA355925 Hs.36232 KIAA0186 gene product

N80181 Hs.221498 ESTs

BE149656 Hs.306621 Homo sapiens cDNA FLJ11963 fis, clone HEMBB1001051

NM_015361 Hs.268053 KIAA0029 protein

AF071076 Hs.112255 nucleoporin 98kD

AA075000 gb:zmS3c07.s1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone 3', mRNA

H56995 Hs.37372 Homo sapiens DNA binding peptide mRNA, partial cds

AA173223 Hs.289044 Homo sapiens cDNA FLJ12048 fis, clone HEMBB1001990

AI041793 Hs.42502 ESTs

AA075998 gb:zm89b09.r1 Stratagene ovarian cancer (937219) Homo sapiens cDNA clone 5' similar to



gb:M15887 ACYL-COA-BINDING PROTEIN (HUMAN);, mRNA sequence 



03761 AA085138 AA765163 
BINDING PROTEIN (HUMAN);, mRNA sequence 



30237 L39060 

28752 RC.N72879 

35162 AA045930 

31386 AA096412 

29021 RO/A599244 

24274 AA293634 

29913 H06583 

31888 U79298 
'3803 mRNA 

18612 RCJJ69466 

322026 AA203138 

10892 RC.N38882 

11429 RC.R01245 

13334 RCJ76962 

104091 AA417310 

I05246 RC^AA226879 



AA913909 
AA504428 
AI187925 
BE219898 
AL044675 
W73933 
NKL001310 
AW294659 

AB037788 

AW024973 

AL035301 

A1038052 

AW974666 

BE465093 

AA226879 



Hs.153088 
Hs.10487 
Hs.95667 
Hs.173135 
Hs.173081 
Hs^83738 
Hs.13313 
H&34054 

Hs^24961 

H&283675 

Hs.97375 

Hs.19162 

Hs^93024 

Hs.106101 



gb:nz79b10.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone 3' similar to gb:M34539 FK506-

TATA box binding protein (TBP)-associated factor, RNA polymerase I, A, 48kD
Homo sapiens, clone IMAGE:3954132, mRNA, partial cds
F-box protein 30

dual-specificity tyrosine-(Y)-phosphorylation regulated kinase 2

KIAA0530 protein 

casein kinase 1, alpha 1 

cAMP responsive element binding protein-like 2 

Homo sapiens cDNA: FLJ22488 fis, clone HRC10948, highly similar to HSU79298 Human clone

cleavage and polyadenylation specific factor 2, 100kD subunit
NPD009 protein

H.sapiens gene from PAC 106H8

ESTs, Weakly similar to I54374 gene NF2 protein [H.sapiens]

ESTs

hypothetical protein FLJ22557

gb2rt9c09^1 Stratagene NT2 neuronal precursor 937230 Homo sapiens cDNA clone



IMAGE:663856 3' similar to contains Alu repetitive element, mRNA sequence,

13300 RC_T67448 T67448 Hs.13101 ESTs

17147 RCJ*97225_s AW901347 Hs.38592 hypothetical protein FLJ23342

21349 RC_AA405205 AA405205 Hs.97960 ESTs, Weakly similar to T51146 ring-box protein 1 [H.sapiens]

00294 D49396 AA331881 Hs.75454 peroxiredoxin 3

33999 M28213 AA535244 Hs.76305 RAB2, member RAS oncogene family

33259 AA278548 BE379646 Hs.6904 Homo sapiens mRNA full length insert cDNA clone EUROIMAGE 2004403

29423 AA371418 AA204686 Hs.234149 hypothetical protein FLJ20647

31098 RC_AA459568 U66669 Hs.236642 3-hydroxyisobutyryl-Coenzyme A hydrolase

35272 AA399391 AI828337 Hs.97591 ESTs

29155 AA046865 AI952677 Hs.108972 Homo sapiens mRNA; cDNA DKFZp434P228 (from clone DKFZp434P228)
311291
120750 
101002 
133012 
103879 
131281 
115109 
118502 
134100 
131869 
115396 
103860 
135089 
129938 
107508 



125170 
129179 
116262 
123009 
131004 
103317 
132814 
103992 
109258 
110754 
132727 
100341 
134664 
103826 
111678 
101341 
115455 
111192 
129385 
125050 
122105 
121324 
120938 
115001 



AA056319 

RCLAA310499 

J04058 

AA099241 

AA228148.S 

RCLAA443212 

RCLAA256383 

RCN67317 

L07540 

AA484944 

RCJVA282985 

AA203742 

N75611_s 

U79300 

W90095 

AA005190 

AA203147 

RC_AA504125_s 

AA477046 

RCJW479949 

D29833 

X83441 

RC.C15251J 

U77718 

X59710 

RCN2Q814 

AA136382.S 

D63506 

AA256106 

AA165564 

RC.R20628 

L76159 

RCJ\A285068 

RC.M477748 

RCJW235604 

RC.T79951 

RC.AA432278 

RCJ\A404229 

RCJW386260 

RCJW251376 



AA782601 

AI191410 

AV655843 

AA847843 

BE543269 

AA251716 

AJ249977 

AL157488 

AA460085 

AW958547 

AA810854 

AW976877 

AI918035 

AW003668 

N74925 

AA158008 

AL020996 

AW969025 

AI936442 

AA535244 

D29833 

X83441 

D60730 

BE018142 

AU044818 

AW302200 

N27495 

AF032922 

AA256106 

AW162998 

R38487 

NM_004477 

AA876002 

AW021968 

AA172106 

AW970209 

AW241685 

AA404229 

AA386260 

AA251376 

R45088 

AA457395 
N48325 
AA427396 



Hs.319817 ESTs 



Hs.169919 

Hs.62711 

Hs.50252 

Hs.25227

Hs.88049 

Hs.50150 

Hs.171075 

Hs.33540 

Hs.89081 

Hs.38057 

Hs301198 

Hs.135587 

Hs.38761 

Hs.292444 

Hs.8518 

Hs.109154 

Hs.59838 

Hs.78305 

Hs.2207 

Hs.166091 

Hs.57471 

Hs300954 

Hs.84928 

Hs.6336 

Hs^565 

Hs.8813 

Hs.87507 

Hs.24684 

Hs.169927 

Hs^03772 

Hs.120551 

Hs.109438

Hs.110950

Hs.111805 

Hs.98699 

Hs.97842 

Hs.104632 



ESTs, Moderately similar to 2109260A B cell growth factor [H.sapiens]
electron-transfer-flavoprotein, alpha polypeptide (glutaric aciduria II)
Homo sapiens, clone IMAGE:3351295, mRNA
mitochondrial ribosomal protein L32
ESTs

protein kinase, AMP-activated, gamma 3 non-catalytic subunit

Homo sapiens mRNA; cDNA DKFZp564B182 (from clone DKFZp564B182)

replication factor C (activator 1) 5 (36.5kD)

ESTs, Weakly similar to dJ309K20.4 [H.sapiens]

ESTs

ESTs

roundabout (axon guidance receptor, Drosophila) homolog 1

Human clone 23629 mRNA sequence

Homo sapiens cDNA: FLJ21564 fis, clone COL06452

ESTs 

selenoprotein N 
ESTs 

hypothetical protein FLJ10808
RAB2, member RAS oncogene family
salivary proline-rich protein
ligase IV, DNA, ATP-dependent
ESTs

Huntingtin interacting protein K
nuclear transcription factor Y, beta
KIAA0672 gene product
hypothetical protein FLJ22626
syntaxin binding protein 3
ESTs

KIAA1376 protein
ESTs

FSHD region gene 1
toll-like receptor 10

Homo sapiens clone 24775 mRNA sequence 

Rag C protein 

ESTs 

ESTs 

EST 

EST 

gb:zs10a06.s1 NCI_CGAP_GCB1 Homo sapiens cDNA clone IMAGE:684754 3', mRNA
gb:yg38g04.s1 Soares infant brain 1NIB Homo sapiens cDNA clone IMAGE:34896 3', mRNA



124799 RC_R45088 
sequence. 

122724 RCLAA457395 AA457395 Hs.99457 
117791 RCJM8325 N48325 Hs.93956 
121895 RCJW427396 

similar to contains Aiu repetitive elementcontains MER1ZI2 MER12 repetitive element ;, mRNA sequence. 



ESTs 
EST 

gfczw33a0ls1 Scares ovary tumor NbHOT Homo sapiens cDNA done IMAGE771Q50 3 1 



108244 RCJW062839 
3*. mRNA sequence. 
117852 RCJM94Q8 
105298 RC_AA205432 
122432 RC^AA447400 
124627 RC.N74625 



AA062839 gb:zm05c09.s1 Stratagene corneal stroma (937222) Homo sapiens cDNA clone IMAGE:513232

AW877787 Hs.136102 KIAA0853 protein

R77854 Hs.350693 Krueppel-related zinc finger protein

AA447400 Hs.187684 ESTs, Weakly similar to B34087 hypothetical protein [H.sapiens]

N74625 gb:za55c03.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:296452 3'



similar to gb:M14338 VITAMIN K-DEPENDENT PROTEIN S PRECURSOR (HUMAN);contains OFR.b3 OFR repetitive element ;, mRNA sequence.



115141 RC_ J AA258071 AA465131 
128636 U49065 U49065 
115373 RC_jAA282197 AA664852 
114651 RC^AAI 01400 AA101400 
132796 RCJ\A180487 NMJ0S283 
103749 RCJI35583 AL135301 
107328 T83444 AW959891 
115349 RC^AA281563 AF121176 
111490 RC_R06862 R06862 
similar to contains L1 repetitive element ; 
103763 AA085354 AA085291 
contains Alu repetitive element, mRNA sequence 



Homo sapiens clone 25218 mRNA sequence
interleukin 1 receptor-like 2
CGW)7p
ESTs

transforming, acidic coiled-coil containing protein 1
hypothetical protein FLJ10849
KIAA0887 protein

DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 16

gb:yf11e09.s1 Soares fetal liver spleen 1NFLS Homo sapiens cDNA clone IMAGE:126568 3',
mRNA sequence.

gb:zn01g06.s1 Stratagene colon HT29 (937221) Homo sapiens cDNA clone 3' similar to



Ks.64001 
Hs.102865 
Hs.181022 
Hs.189960 
Hs.173159 
Hs^768 
Hs.76591 
Hs.12797 



118791 RC.N75520 N75520 

116644 RC.F03032 F03032 

116823 RCJH56485 AW204742 
[Rsapiens] 

108940 RC_AA148603 AA148603 
(MAGES67198 3", mRNA sequence, 

112218 RC.R50057 R50057 

116557 RCJ)2Q572J D20572 

133649 U25849 U25849 
131745 RC.C20746 



Hs.261003 ESTs, Moderately similar to B34087 hypothetical protein [H.sapiens]

Hs^90278 ESTs, Weakly similar to B34087 hypothetical protein [H.sapiens]

Hs.143542 ESTs, Highly similar to CSA_HUMAN COCKAYNE SYNDROME WD-REPEAT PROTEIN CSA

gb:zo09e04.s1 Stratagene neuroepithelium NT2RAMI 937234 Homo sapiens cDNA clone

Hs.272251 Homo sapiens mRNA; cDNA DKFZp586M1418 (from clone DKFZp586M1418)

Hs.90171 EST

Hs.75393 acid phosphatase 1, soluble

Hs.31447 ESTs, Moderately similar to A46010 X-linked retinopathy protein [H.sapiens]
AA251548

H29882 

AA347919 

AA018298 

AF025771 

N67946 

AB020700 



BE276055 

R66740 

MH.003478 



116801 RC.H43879 H43879 
sequence. 

115006 RCJ*A251548 
123424 RC_AA598500 
5 120831 RCJW347919 

103691 AA018298 
121556 RC_AA412491 
111193 RC.N67946 
132061 RCLAAQ58946 

10 134575 RC.M194568J AA194568 

115050 RCLAA252794 AA252794 
420208 U31799 
133735 AC002045_xpt1 
12B546 Z21305 

15 111946 RCJW0697 

124879 RC.R735B8 R73588 

115683 AA410345 AF255910 

103692 AA018418 AW137912 
(CACNA1F) gene, complete cds; HSP27 

20 103767 AA089688 BE244667 

125266 W90022 W90022 
PRECURSOR [Rsapiens] 

135235 AA435512 AW298244 

134497 RC.AA404494 BE258532 

25 426754 RC.AA278529J NM_014264 

412177 RC_jAA342828_s Z23091 

132000 RCLAA044644 AW247017 

124738 RC.AA044644 T07568 

324000 RC,AA196729j AA604749 

30 106896 RCJW196729J AW073202 

132000 RC.AA025B58 AW247017 

129577 RCJVA025858 N75346 

107091 RCJW233519 AI949109 

130296 RC_N52271 D31139 

35 102855 RCJJS8399 NNL003528 

113689 RC_M093874 AB037850 

100939 RCJU\279667_s L04288 

130430 RC.H22556 W27893 

106734 RC_N45979_s BE295690 

40 intersedin 2 long isofbrm (I7SN2) mRNA 

135148 RC_AA431288_s AA306478 

134221 RCJW609862 BE280456 

105376 RC_N35583 AW994032 

124541 U77718 AF112222 

45 134546 AA203147 AL020996 

134000 RC.W93092 AW175787 

125656 RC.W93092 AW516428 

100939 RC_N58561_s L04288 

125556 RC.W93092 AW516428 

50 101779 RCJV69385.S BE543412 

332489 RCLR22947 R23053 

133000 RC.N38959J AL042444 

125905 RC.N38959J AI678638 

129000 RC.H73050.S AA744902 

55 100920 RC_H73050_s X54534 



Hs3830 
Hs.85938 
Hs.88009 
Hs.95972 



gb.7o69h09^1 Soares breast 3NbHBst Homo sapiens cDNA done IMAGE: 183233 3*. mRNA 

Hs.87885 EST 

Hs.162614 ESTs 

Hs.96889 EST 

Hs.103332 ESTs 

Hs£0123 zinc finger protein 189 

Hs.117569 ESTs

KIAA0893 protein
EST
ESTs

silver (mouse homolog) like

Hs.110613 KIAA0220 protein

Hs.101299 cullin 5

Hs.76666 C9orf10 protein

Hs.101533 ESTs

Hs.54650 junctional adhesion molecule 2

Hs.227583 Homo sapiens chromosome X map Xp11.23 L-type calcium channel alpha-1 subunit
pseudogene, complete sequence; and JM1 protein, JM2 protein, and Hb2E genes, complete cds

Hs.296155 CGI-100 protein

Hs.186809 ESTs, Highly similar to LCT2_HUMAN LEUKOCYTE CELL-DERIVED CHEMOTAXIN 2

Hjl2935Q7 ESTs 

Hs.251871 CTP synthase 

Hs.172052 serine/threonine kinase 18 

Hs.73734 glycoprotein V (platelet)

Hs.36978 melanoma antigen, family A, 3

Hs.137158 ESTs

Hs.190213 ESTs

Hs.334825 Homo sapiens cDNA FLJ14752 fis, clone NT2RP3003071

Hs.36978 melanoma antigen, family A, 3

Hs.82906 CDC20 (cell division cycle 20, S. cerevisiae, homolog)

Hs.246885 hypothetical protein FLJ20783

Hs.154103 LIM protein (similar to rat protein kinase C-binding enigma)

Hs.2178 H2B histone family, member Q

Hs.16621 DKFZP434I116 protein

Hs.297939 cathepsin B

Hs.150580 putative translation initiation factor

Hs.288173 Homo sapiens cDNA: FLJ21747 fis, clone COLF5160, highly similar to AF182198 Homo sapiens

Hs.95327 CD3D antigen, delta polypeptide (TiT3 complex)

Hs.80248 RNA-binding protein gene with multiple splicing

Hs.8768 hypothetical protein FLJ10849

Hs.44499 pinin, desmosome associated protein

Hs.8518 selenoprotein N 

Hs.334841 selenium binding protein 1 

Hs.78687 neutral sphingomyelinase (N-SMase) activation associated factor 

Hs.297939 cathepsin B 

Hs.78687 neutral sphingomyelinase (N-SMase) activation associated factor 

Hs.250505 retinoic acid receptor, alpha

NA HuOIChipRedos

Hs.62402 p21/Cdc42/Rac1-activated kinase 1 (yeast Ste20-related)

Hs.6456 chaperonin containing TCP1, subunit 2 (beta)

Hs.107767 hypothetical protein PR01489

Hs.278994 Rhesus blood group, CcEe antigens

TABLE 1A

Table 1A shows the accession numbers for those pkeys lacking unigeneID's for Table 1. The pkeys in Table 7 lacking unigeneID's are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number
CAT number: Gene cluster number
Accession: Genbank accession numbers
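
For readers working with an electronic copy of this table, the three-column layout defined in the legend above can be sketched as a simple record. The sketch below is illustrative only and is not part of the original disclosure: the names `Table1ARow` and `parse_row` are hypothetical, and the sample row is reassembled from the flattened columns of this table on the assumption that each row is whitespace-delimited with the Pkey first, the CAT number second, and one or more Genbank accessions after that.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Table1ARow:
    """One row of Table 1A (hypothetical record type, for illustration)."""
    pkey: int               # Unique Eos probeset identifier number
    cat_number: str         # Gene cluster number, kept as text (e.g. "114161.1")
    accessions: List[str]   # Genbank accession numbers comprising the cluster


def parse_row(line: str) -> Table1ARow:
    """Split a whitespace-delimited row into Pkey, CAT number and accessions."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(
            f"expected a Pkey, a CAT number and at least one accession: {line!r}"
        )
    return Table1ARow(pkey=int(fields[0]), cat_number=fields[1], accessions=fields[2:])


# Sample row reassembled from the columns of this table (an assumption, see above).
row = parse_row("103744 114161.1 AA079267 AA076003")
print(row.pkey, row.cat_number, row.accessions)
```

The CAT number is kept as a string because cluster identifiers in this table carry suffixes that are not plain integers.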



Pkey CAT Number Accession 



108469 
124106 
108501 
108562 
125008 
125020 
125066 
116661 
125104 
124575 
125263 
116845 
118417 



118584 
103743 
103744 
103746 
103761 
103763 
120209 
1202B4 
112540 
111904 
121059 
121094 
114106 
130091 
122264 
108280 
129961 
130529 
108309 
107832 
123731 
116571 
132225 
125017 
125063 
125064 
100964 
125118 
102269 
125150 
116801 
118111 
118129 
118329 
118475 
111490 
111514 
104534 
120340 



116761J AA0794B7 AA128547 AA128291 AA079587 AA079600 
125446.1 H12245AA094769R14576 
13684 -12 AAD83256 

3637511 AA100795 AF020589 AA074629 AA075946 AA100849 AA085347 AA1 26309 AA079311 AA079323 AA085274 

1802095 J T91251 T64891 T85665 

1 16017J T69981 T69924 AA078476 

1814993.1 T86284T81933 

1532859.1 R61504F04247 

413347 J T95590 AA703278 H62764 

1666649.1 N68168N69188 N90450 

1547 J AA098878 W88942 

393481 1 AA649530AA659316 H64973 

37186J AF080229 AF080231 AF080230 AF080232 AF080233 AF080234 BE550633 A1636743 AW614951 BE467547 AIB80833 
AI633818 N29986 U87592 U87593 U87590 U87591 S46404 U87587 AA463992 AW206802 A1970376 AI583718 AI672574 
N25695 AW665466 AI818326 AA126128 AM80345 AW013827 AA248638 A1214968 AA204735 AA207155 AA206262 AA204833 
AW003247 AW49S808 AI080480 A1631703 AI651023 AI867418 AW818140 AA502500 A1208199 AI671282 A1352545 BE501030 
A! 552535 BE465762 AA206331 AW451866 AA471088 AA206342 AA204834 AA206100 AW021661 AA332922 N66048 
AA703396 H92278 AW139734 H92683 U87589 U87595 H69001 U87594 BE466420 AI624817 BE46661 1 AI206344 AA574397 
AA348354 A1493192 

532052.1 AW136928 AI685655 BE218584 BE465078 N68963 AA975338 BE147199 N76377 
112194 1 AA075998 AA075999 AA070936 AA070896 AA129207 AA078942 AA0707B3 AA078941 
1 14161.1 AA079267 AA076003 
113452 1 AA075000AA081876 

1 14208.1 AA765163 AW298222 AA126126 AA085138 AA076068 

48290.6 AA085291 AA085354 

1531817.1 F02951Z40892 F04711 

158963.1 AA179656 AA182526 AA182603 

1605263.1 R69751 R70467 H69771 H80879 H80878 

1719336.1 Z41572 R39330 

273450 J AA393283AA398628 

275729.1 AA402505AA398900 

1 182096.1 AW602528 BE073859 Z38412 

23961.-3 W88999 

296527.1 AA436837AA442594 

110682.1 AA065C69AA085108 

1706092J R23053 R79884 R76271 

158447.1 AA178953AA192740 

AA069818 AA069971 AA059923 AA069908 
AA021473 
AA609839 
D45652 
AA128980 
T68875 
T85352 
T85373 



111495.1 

genbank_AA021473
genbank_AA609839
genbank_D45652
genbank_AA128980
genbank_T68875
genbank_T85352
genbank_T85373
entrez_J00212 J00212
149288.1 R10606 T97620 AA576309
entrez_U30245 U30245

NOT_FOUND_entrez_W38240 W38240



genbank_H43879
genbank_N55493
genbank_N57493
genbank_N53520
genbank_N66845
genbank_R06862
genbank_R07998
R22303_at R22303
genbank_AA206828



H43879
N55493
N57493
N53520
N66845



R07998
AA206828



120376 genbank_AA227469

104787 genbank_AA027317

120409 genbank_AA235050

120745 genbank_AA302809

120809 genbank_AA346495

120839 genbank_AA348913

113702 genbank_T97307

115001 genbank_AA251376

122562 genbank_AA452156

122635 genbank_AA454085

108244 genbank_AA062839

108277 genbank_AA064859

122723 genbank_AA457380

124028 genbank_F04112

108403 genbank_AA075374

122860 genbank_AA464414

108427 genbank_AA076382

108439 genbank_AA078986

131353 231290.1
AA969360

108533 genbank_AA084415

117031 genbank_H88353

124254 genbank_H69899

101447 entrez_M21305

101458 entrez_M22092

124577 genbank_N68300

108940 genbank_AA148603

108941 genbank_AA148650
124627 genbank_N74625
124720 144582J

124793 genbank_R44519

124799 genbank_R45088
117683 genbank_N40180
117732 genbank_N46452

124991 genbank_T50116
119023 genbank_N98488
119239 95573_2



AA227469 

AA027317 

AA235050 

AA302809 

AA346495 

AA348913 

T97307 

AA251376 

AA452156 

AA454085 

AA062839 

AA064859 

AA457380 

F04112 

AA075374 

AA464414 

AA076382 

AA078988 

AW41 1259 H23555 AW015049 AI584275 AW0 15886 AW068953 AW014085 AI027260 R52686 AA918278 Al 129462 

N34869 AI946416 AA534205 AA702483 AA705292 

AA084415 

H88353 

M21305 
M22092 



N68300 
AA148603 
AA148650 
N74625 

R05283 R11056 

R44519 

R45088 

N40180 

N46452 

T50116 

N98488 

T11483T11472 



119558 NOT_FOUND_entrez_W38194



W38194 



119654 genbank_W57759 W57759

105246 genbank_AA226879 AA226879

121350 genbank_AA405237 AA405237

121558 genbank_AA412497 AA412497

105985 genbank_AA406610 AA406610

100071 entrez_A28102 A28102

114648 genbank_AA101056 AA101056

121895 genbank_AA427396 AA427396

100327 entrez_D55640 D55640

123315 714071J AA496369 AA496646

TABLE 2:

Pkey: Unique Eos probeset identifier number
Accession: Accession number used for previous patent filings
ExAccn: Exemplar Accession number, Genbank accession number
Unigene ID: Unigene number
Unigene Title: Unigene gene title

Pkey Accession ExAccn UnigeneID

100420 100420 D88S83 Hs.118893

100484 100484 NM_005402 Hs.288757

100991 100991 J03836 Hs.82085

101168 101168 NM_005308 Hs^11569

101261 101261 D30857 Hs.82353

101447 101447 M21305

101543 101543 M31166 Hs.2050

101560 101560 AW958272 Hs.347326

101714 101714 M68374 Hs.211587

101838 101838 BE243845 Hs.75511

102012 102012 BE259035 Hs.118400

102164 102164 NM_000107 Hs.77602

102283 102283 AW161552 Hs.83381

102564 102564 U59423 Hs.79067

102759 102759 NM_005100 Hs.788

102804 102804 NM_002318 Hs.83354

102898 102898 NM_002205 Hs.149609

103036 103036 M13509 Hs.83169

103095 103095 NM_005424 Hs.78824

103165 103166 AA159248 Hs.180909

103280 103280 U84722 Hs.76206

103850 103850 AA187101 Hs.213194

104592 104592 AW630488 Hs.25338

104786 104786 AA027167 Hs.10031

104865 104865 T79340 Hs.22575

104952 104952 AW076098 Hs.345588

105178 105178 AA313825 Hs^1941

105330 105330 AW338625 Hs.22120

105729 105729 H46612 Hs.293815

105977 105977 AK001972 Hs.30822

106031 106031 X64116 Hs.171844

106155 106155 AA425414 Hs.33287

106423 106423 AB020722 Hs.16714

107174 107174 BE122762 Hs.25338

107295 107295 AA186629 Hs.80120

108756 108756 AA127221 Hs.117037

108888 108888 AA135606 Hs.189384

109166 109166 AA219691 Hs.73625

109768 109768 F06838 Hs.14763

110906 110906 AA035211 Hs.17404

111006 111006 BE387014 Hs.166146

111133 111133 AW580939 Hs.97199

113073 113073 N39342 Hs.103042

113923 113923 AW953484 Hs.3849

115061 115061 AI751438 Hs.41271

115145 115145 AA740907 Hs.88297

115947 115947 R47479 Hs.94761

116339 116339 AK000290 Hs.44033

116589 116589 AI557212 Hs.17132

117023 117023 AW070211 Hs.102415

117563 117563 AF055634 Hs.44553

118475 118475 N66845

119073 119073 BE245360 Hs.279477

119174 119174 R71234

119416 119416 T97186

121335 121335 AA404418

123160 123160 AA488687 Hs.284235

123523 123523 AA608588

123964 123964 C13961

124315 124315 NM_005402 Hs.288757

124669 124669 AI571594 Hs.102943

124875 124875 AI887664 Hs.285814

125103 125103 AA570056 Hs.122730

125565 125565 R20840



Unigene Title

Melanoma associated gene
v-ral simian leukemia viral oncogene hom
serine (or cysteine) proteinase inhibito
G protein-coupled receptor kinase 5
protein C receptor, endothelial (EPCR)
gb:Human alpha satellite and satellite 3
pentaxin-related gene, rapidly induced b
intercellular adhesion molecule 2
phospholipase A2, group IVA (cytosolic,
connective tissue growth factor
singed (Drosophila)-like (sea urchin fas
damage-specific DNA binding protein 2 (4
guanine nucleotide binding protein 11
MAD (mothers against decapentaplegic, Dr
A kinase (PRKA) anchor protein (gravin)
lysyl oxidase-like 2
integrin, alpha 5 (fibronectin receptor,
matrix metalloproteinase 1 (interstitial
tyrosine kinase with immunoglobulin and
peroxiredoxin 1

cadherin 5, type 2, VE-cadherin (vascula
hypothetical protein MGC10895
protease, serine, 23
KIAA0955 protein

B-cell CLL/lymphoma 6, member B (zinc fi
desmoplakin (DPI, DPII)
AD035 protein
ESTs

Homo sapiens HSPC285 mRNA, partial cds
hypothetical protein FLJ11110
Homo sapiens cDNA: FLJ22296 fis, clone H
nuclear factor I/B

Rho guanine exchange factor (GEF) 15
ESTs

UDP-N-acetyl-alpha-D-galactosamine:polyp
ESTs

gb:zl10a05.s1 Soares_pregnant_uterus_NbH
RAB6 interacting, kinesin-like (rabkines
ESTs
ESTs

Homer, neuronal immediate early gene, 3
complement component C1q receptor
microtubule-associated protein 1B
hypothetical protein FLJ22041 similar to
Homo sapiens mRNA full length insert cDN
ESTs

KIAA1691 protein

dipeptidyl peptidase 8

ESTs, Moderately similar to I54374 gene

Homo sapiens mRNA; cDNA DKFZp586N0121 (f

unc5 (C.elegans homolog) c

gb:za46c11.s1 Soares fetal liver spleen

ESTs

gb:yi54c08.s1 Soares placenta Nb2HP Homo
gb:ye5Gh09.s1 Soares fetal liver spleen
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_
ESTs, Weakly similar to I38022 hypotheti
gb:ae54e06.s1 Stratagene lung carcinoma
gb:C13961 Clontech human aorta polyA+ mR
v-ral simian leukemia viral oncogene hom
hypothetical protein MGC12916
sprouty (Drosophila) homolog 4
ESTs, Moderately similar to KIAA1215 pro
gb:yg05c08j1 Soares infant brain 1NIB H
Hs.44033
Hs.17132


AW070211 


Hs.102415 


AF055634 


Hs.44553


N66845
Hs.279477


R71234 




T97186 




AA404418
Hs.284235


AA608588 




C13961
126511
126649 
449602 
127402 
128992 
129188 
129371 
129765 
129884 
130639 
130828 
131080 
131182 
131573 
131756 
131881 
132083 
132358 
132456 
132576 
132718 
132760 
132968 
133061 
133161 
133260 
133491 
133550 
133614 
133891 
133913 
133985
134088 
134299 
116470 
134989 
135073 
100114 
100143 
100208 
100405 
100455 
100618 
100658 
100718 
100828 
100991 
101110 
101156 
101184 
101317 
101345 
101475 
101496 
101543 
101560 
101592 
101634 
101682 
101720 
101744 
101837 
101840 
101864 
101966 
102013 
102059 
102283 
102378 
102460 
102499 
102560 
102589 
102645 
102693 
102759 



126511 
126649 



127402 
128992 
129188 
129371 
129765 
129884 
130639 
130828 
131080 
131182 
131573 
131756 
131881 
132083 
132358 
132456 
132676 
132718 
132760 
132968 
133061 
133161 
133260 
133491 
133550 
133614 
133691 
133913 
133985 
134088 
134299 
116470 
134989 
135073 
100114 
100143 
100208 
100405 
100455 
100618 
100658 
100718 
100828 
100991 
101110 
101156 
101184 
101317 
101345 
101475 
101496 
101543 
101560 
101592 
101634 
101682 
101720 
101744 
101837 
101840 
101864 
101966 
102013 
102059 
102283 
102378 
102460 
102499 
102560 
102589 
102645 
102693 
102759 



T92143 

AA001860 

AA001860 

AA358869 

H04150 

NM_001076 

X06828 

M36933 

AF055581 

AI557212 

AW631469 

NM_001955

AI824144

AA040311 

AA443966 

AW361018 

BE386490 

NM_003542

AB011084

N92589

NM_004600

AA125985 

AF234532 

AI186431 

AW021103 

AA403045 

BE619053 

Al 129903 

NMJ03003 

M85289 

AU076964 

L34657 

AI379954 

AW580939 

AI272141 

AW968058 

W55956 

X02308 

AU076465 

NM.002933 

AW291587 

AW888941 

AI752163 

U56725 

BE29592B 

AL048753 

J03836 

A1439011 

AA340987 

NM_001674 

L42176 

NM.005795 

BE410405 

X12784 

M31166 

AW958272 

AF064853 

AV650262 

AF043045 

M69043 

AI879352 

M92843 

AA236291 

BE392588 

X96438 

BE616287 

AI752666 

AW161552 

AU076887 

U48959 

BE243877 

RS7457 

AU076728 

AL119566 

AA532780 

NM_005100 



Hs£7958 

Hs279531 

Hs279531 

Hs^27949 

Hs.107708 

Hs.109225 

Hs.110802 

Hs.1238 

Hs.13131 

Hs.17132 

Hs^03213 

Hsu2271 

Hs£3912 

H&28959 

Hs.31595 

Hs.3383 

Hs^79663 

Hs.46423 

Hs.48924 

Hs.261038 

Hs£54 

Hs.56145 

Hs.61638 

Hs.296638 

Hs.6631 

Hs.6906 

Hs.170001 

Hs.74669 

Hs.75232 

Hs.211573 

Hs.7753 

Hs.78146 

Hs.79025 

Hs.97199 " 

Hs^3484 

Hs.92381 

Hs.94030 

Hs.82962 

Hs.278441 

Hs.78224 

Hs.82733 

Hs.75789 

Hs.114599 

Hs.180414 

Hs.75424 

Hs^Q3649 

Hs.82085 

Hs.86386 

Hs.75693 

Hs.460 

Hs.8302 

Hs.152175 

Hs.76288 

Hs.119129 

Hs^050 

Hs.347326 

Hs.91299 

Hs.75765 

Hs.81008 

Hs.81328 

Hs.1 18625 

Hs.343586 

Hs.183583 

Hs.75777 

Hs.76095 

Hs.178452 

Hs.76669 

Hs.83381 

Hs.28491 

Hs.211582 

Hs.76941 

Hs.63984 

Hs.8867 

Hs.6721 

Hs.183684 

Hs.788 



EGF JM74atrophHirwelated protein 

ESTs 

ESTs 

SEC13 (S. cerevisiae)-Gxe 1 
ESTs 

vascular ceO adhesion molecule 1 
vonWfltebrand factor 
amelogenin (Y chrornosorne) 
lysosomal 

ESTs, Moderately similar to I54374 gene 
ESTs 

endothetin 1 
ESTs ♦ 
ESTs 
ESTs 

upstream regutetory element binding prot 
Pirtn 

H4 histone family, member G 
K1AA0512 gene product ALEX2 
ESTs, Weakly similar to 138022 hypothefi 
Sjogren syndrome antigen A2 (60kD, ribon 
thymosin, beta, identified in neuroblast 
myosin X 

prostate differentiation factor 
hypothetical protein FU 20373 
Homo sapiens cDNA: FU23197 fs, clone R 
eukaryofJc translation initiation factor 
vesicle-associated membrane protein 5 (m 
SEC14 (S. cerevisiaeHike 1 - 
heparan sulfate proteoglycan 2 (pertecan 



pjatetet/enaotneuai ceu aoneston moiec 
WAA0096 protein 

complement component C1q receptor 

SRY (sex determining region Y>box 4 

nudix (nucleoside diphosphate Dnked moi 

Homo sapiens mRNA; cDNA DKFZp586E1624 (f 

thymidytate synthetase 

K1AA0015 gene product 

ribonudease, RNase A family, 1 (pancrea 

nldogen2 

N-myc downstream regulated 
coQagen, type VIII, alpha 1 
heat shock 70kO protein 2 
inhibitor of DMA binding 1, dominant neg 
small inducible cytokine A2 (monocyte ch 
serine (or cysteine) proteinase hhfcrto 
myeloid cell leukemia sequence 1 (BCL2w 
procarboxypeptidase (angiotensinase C 
activating transcription factor 3 
four and a half LIM domains 2 
calcitonin receptor-fike 
caipain 2, (m/ll) large subunit 
coBagen, type IV, alpha 1 
pentaxln-related gene, rapidly induced b 
intercellular adhesion molecule 2 
guanine nucleotide binding protein (G pr 
6R02 oncogene 

tUamin B, beta (acCn-binding protein-2 
nuclear factor of kappa Oght polypepfjd 
hexokinasel 

zinc finger protein homologous to Zfp-36 
serine (or cysteine) proteinase inhfi>ito 



immediate early response 3 
catenin (cadterrn-associated protein), a 
nicotmamide N-rrtemyltransferase 
guanine nucleotide binding protein 1 1 
spemikline/spermine Nl-acetytoansferase 
myosin, Gght poJypepSde kinase 
ATPase, Na^+transporfing, beta 3 poly 
cadhertn 13, H-cadherin (heart) 
cysteine-nch, angiogenic inducer, 61 



eukaryotic translation initiation factor 
A kinase (PRKA) anchor protein (g ravin) 
102882 102882 AI767738 Hs.290070 gelsolin (amyloidosis, Finnish type)
102915 102915 X07820 Hs.2258 matrix metalloproteinase 10 (stromelysin
102960 102960 AJ904738 Hs.76053 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
103020 103020 X53416 Hs.195464 filamin A, alpha (actin-binding protein-
103036 103036 M13509 Hs.83169 matrix metalloproteinase 1 (interstitial
103080 103080 AU077231 Hs.8332 cyclin D1 (PRAD1: parathyroid adenomatos
103138 103138 X65965 gb:H.sapiens SOD-2 gene for manganese su
103195 103195 AA351647 Hs.2542 eukaryotic translation elongation factor
103371 103371 X91247 Hs.13046 thioredoxin reductase 1
103471 103471 Y00815 Hs.75216 protein tyrosine phosphatase, receptor t
104447 104447 AW204145 Hs.156044 ESTs
104783 104783 AA533513 Hs.93659 protein disulfide isomerase related prot
104865 104865 T79340 Hs.22575 B-cell CLL/lymphoma 6, member B (zinc fi
104894 104894 AF065214 Hs.18858 phospholipase A2, group IVC (cytosolic,
105113 105113 AB037816 Hs.8982 Homo sapiens, clone IMAGE:3506202, mRNA,
105196 105198 W84893 Hs.9305 angiotensin receptor-like 1
105263 105263 AW388633 Hs.6682 solute carrier family 7, (cationic amino
105330 105330 AW338625 Hs.22120 ESTs
105492 105492 AI805717 Hs^89112 CGI-43 protein
105594 105594 AB024334 Hs.25001 tyrosine 3-monooxygenase/tryptophan 5-mo
105732 105732 AW504170 Hs.274344 hypothetical protein MGC1842
105882 105882 W46802 Hs.81988 disabled (Drosophila) homolog 2 (mitogen
106031 106031 X64116 Hs.171844 Homo sapiens cDNA: FLJ22296 fis, clone H
106222 106222 AA356392 Hs.21321 Homo sapiens clone FLB9213 PRO2474 mRNA,
106263 106263 W21493 Hs.25329 hypothetical protein FLJ14005
106366 106366 AA166715 Hs.336429 RIKEN cDNA 9130422N19 gene
106634 106634 W25491 Hs.288909 hypothetical protein FLJ22471
106793 106793 H94997 Hs.16450 ESTs
106842 106842 AF124251 Hs.26054 novel SH2-containing protein 3
106890 106890 AA489245 Hs.88500 mitogen-activated protein kinase 8 inter
106974 106974 AI817130 Hs.9195 Homo sapiens cDNA FLJ13698 fis, clone PL
107061 107061 BE147611 Hs.6354 stromal cell derived factor receptor 1
107216 107216 D51069 Hs.211579 melanoma cell adhesion molecule
107444 107444 W28391 Hs.343258 proliferation-associated 2G4, 38kD
108507 108507 AI554545 Hs.68301 ESTs
108931 108931 AA147186 gb:zo38d01.s1 Stratagene endothelial cel
109195 109195 AF047033 Hs.132904 solute carrier family 4, sodium bicarbon
109456 109456 AW956580 Hs.42699 ESTs
110411 110411 AW001579 Hs.9645 Homo sapiens mRNA for KIAA1741 protein,
110906 110906 AA035211 Hs.17404 ESTs
111091 111091 AA300067 Hs.33032 hypothetical protein DKFZp434N185
111378 111378 AW160993 Hs.326292 hypothetical gene DKFZp434A1114
111769 111769 AW629414 Hs.24230 ESTs
112951 112951 AA307634 Hs.6650 vacuolar protein sorting 45B (yeast homo
113195 113195 H83265 Hs.8881 ESTs, Weakly similar to S41044 chromosom
113542 113542 H43374 Hs.7890 Homo sapiens mRNA for KIAA1671 protein,
113847 113847 NM_005032 Hs.4114 plastin 3 (T isoform)
113947 113947 W84768 gb:zh53d03.s1 Soares_fetal_liver_spleen_
115061 115061 AI751438 Hs.41271 Homo sapiens mRNA full length insert cDN
115870 115870 NM_005985 Hs.48029 snail 1 (drosophila homolog), zinc finge
116228 116228 AI767947 Hs.50841 ESTs
116314 116314 AI799104 Hs.178705 Homo sapiens cDNA FLJ11333 fis, clone PL
117023 117023 AW070211 Hs.102415 Homo sapiens mRNA; cDNA DKFZp586N0121 (f
117156 117156 W73853 ESTs
117280 117280 M18217 Hs.172129 Homo sapiens cDNA: FLJ21409 fis, clone C
119866 119866 AA496205 Hs.193700 Homo sapiens mRNA; cDNA DKFZp586I0324 (f
121314 121314 W07343 Hs.182538 phospholipid scramblase 4
121822 121822 AI743860 metallothionein 1E (functional)
122331 122331 AL133437 Hs.110771 Homo sapiens cDNA: FLJ21904 fis, clone H
123160 123160 AA488687 Hs.284235 ESTs, Weakly similar to I38022 hypotheti
124059 124059 BE387335 Hs.283713 ESTs, Weakly similar to S64054 hypotheti
124358 124358 AW070211 Hs.102415 Homo sapiens mRNA; cDNA DKFZp586N0121 (f
124726 124726 NM_003654 Hs.104576 carbohydrate (keratan sulfate Gal-6) sul
125167 125167 AL137540 Hs.102541 netrin 4
125307 125307 AW580945 Hs.330466 ESTs
107985 107985 T40064 Hs.71968 Homo sapiens mRNA; cDNA DKFZp564F053 (fr
125598 125598 T40064 Hs.71968 Homo sapiens mRNA; cDNA DKFZp564F053 (fr
413731 413731 BE243845 Hs.75511 connective tissue growth factor
116024 116024 AA088767 Hs.83883 transmembrane, prostate androgen induced
418000 418000 AA932794 Hs.83147 guanine nucleotide binding protein-like
126399 126399 AA088767 Hs.83883 transmembrane, prostate androgen induced
127566 127566 AI051390 Hs.116731 ESTs
128453 128453 X02761 Hs.287820 fibronectin 1
128515 128515 BE395085 Hs.10085 type I transmembrane protein Fn14
128623 128623 BE076608 Hs.105509 CTL2 gene
128669 128669 W28493 Hs.180414 heat shock 70kD protein 8
128914 128914 AW867491 Hs.107125
129188 129188 NM_001078 Hs.109225
129285 129265 AA530892 Hs.171695
129468 129468 AW410538 Hs.111779
101838 101838 BE243845 Hs.75511
129619 129619 AA209534 Hs.284243
129762 129762 AA453694 Hs.12372
130018 130018 AA353093
130178 130178 U20982 Hs.1516
130431 130431 AW505214 Hs.155560
130553 130553 AF062649 Hs.252587
130639 130639 AI557212 Hs.17132
130686 130686 BE546267 Hs.337986
130818 130818 AW190920 Hs.19928
130899 130899 AI077288 Hs £96323
131080 131080 NM_001955 Hs.2271
131091 131091 AJ271216 Hs.22880
131182 131182 AI824144 Hs.23912
131319 131319 NM_003155 Hs.25590
131328 131328 AW939251 Hs.25647
131328 131328 AW939251 Hs.25647
131555 131555 T47364 Hs.278613
131573 131573 AA040311 Hs.28959
131756 131756 AA443966 Hs.31595
131909 131909 NMJM6558 Hs^744l1
132046 132046 AJ359214 Hs.179260
132151 132151 BE379499 Hs.173705
132187 132187 AA235709 Hs.4193
132314 132314 AF112222 Hs.323806
132398 132398 AA876616 Hs.16979
132490 132490 N^.001^0 Hs.4980
132546 132546 M24283 Hs.168383
132716 132716
105169 105169 BE245294 Hs.160789 S164 protein
130401 130401 BE396283 Hs.173987 eukaryotic translation initiation factor
130114 130114 AA233393 Hs.14992 hypothetical protein FLJ11151
105337 105337 AI468789 Hs.347187 myotubularin related protein 1
105376 105376 AW994032 Hs.8766 hypothetical protein FLJ10849
131962 131962 AK000046 Hs.343877 hypothetical protein FLJ20039
128658 128658 BE397354 Hs.324830 diptheria toxin resistance protein requi
105508 105508 AA173942 Hs.326416 Homo sapiens mRNA; cDNA DKFZp564H1916 (f
135172 135172 AB028956 Hs.12144 KIAA1033 protein
132542 132542 AL137751 Hs.263671 Homo sapiens mRNA; cDNA DKFZp434I0812 (f
105659 105659 AA283044 Hs.25625 hypothetical protein FLJ11323
105674 105674 AI609530 Hs.279789 histone deacetylase 3
105722 105722 AI922821 Hs.32433 ESTs
115951 115951 BE546245 Hs.301048 sec13-like protein
105985 105985 AA406610 gb:zv15b10.s1 Soares_NhHMPu_S1 Homo sapi
131216 131216 AI815486 Hs.243901 Homo sapiens cDNA FLJ20738 fis, clone HE
113689 113689 AB037850 Hs.16621 DKFZP434I116 protein
130839 130839 AB011169 Hs.20141 similar to S. cerevisiae SSM4
130777 130777 AW135049 Hs.26285 Homo sapiens cDNA FLJ10643 fis, clone NT
106195 106196 AA525993 Hs.173699 ESTs, Weakly similar to ALU1_HUMAN ALU S
133200 133200 AB037715 Hs.183639 hypothetical protein FLJ10210
106328 106328 AL079559 Hs.28020 KIAA0766 gene product
106423 106423 AB020722 Hs.16714 Rho guanine exchange factor (GEF) 15
439608 439508 AW864698 Hs.301732 hypothetical protein MGC5306
106503 106503 AB033042 Hs.29679 cofactor required for Sp1 transcriptiona
106543 106543 AA676939 Hs.69285 neuropilin 1
106589 106589 AK000933 Hs.28661 Homo sapiens cDNA FLJ10071 fis, clone HE
106596 106596 AA452379 ESTs, Moderately similar to ALU7_HUMAN A
106636 106636 AW958037 Hs.286 ribosomal protein L4
131353 131353 AW754182 gb:RC2-CT0321-131199-011-c01 CT0321 Homo
131710 131710 NM_015368 Hs^0985 pannexin 1
131775 131775 AB014548 Hs^1921 KIAA0648 protein
106773 106773 AA478109 Hs.188833 ESTs
106817 106817 D61216 Hs.18672 ESTs
106848 106848 AA449014 Hs.121025 chromosome 11 open reading frame 5
418699 418699 BE539639 Hs.173030 ESTs, Weakly similar to ALU8_HUMAN ALU S
130638 130638 AW021276 Hs.17121 ESTs
107059 107059 BE614410 Hs.23044 RAD51 (S. cerevisiae) homolog (E coli Re
107115 107115 BE379623 Hs.27693 peptidylprolyl isomerase (cyclophilin)-l
107156 107156 AA137043 Hs.9663 programmed cell death 6-interacting prot
130621 130621 AW513087 Hs.16803 LUC7 (S. cerevisiae)-like
132626 132626 AW504732 Hs.21275 hypothetical protein FLJ11011
131610 131610 AA357879 Hs^9423 scavenger receptor with C-type lectin
107295 107295 AA186629 Hs.80120 UDP-N-acetyl-alpha-D-galactosamine:polyp
107315 107315 AA316241 Hs.90691 nucleophosmin/nucleoplasmin 3
107328 107328 AW959891 Hs.76591 KIAA0887 protein
134715 134715 U48263 Hs.89040 prepronociceptin
129938 129938 AW003668 Hs.135587 Human clone 23629 mRNA sequence
130074 130074 AL038596 Hs^50745 polymerase (RNA) III (DNA directed) (62k
132036 132036 AL157433 Hs.37706 hypothetical protein DKFZp434E2220
113857 113857 AW243158 Hs.5297 DKFZP564A2416 protein
130419 130419 AF037448 Hs.155489 NS1-associated protein 1
132616 132616 BE262677 Hs.283558 hypothetical protein PRO1855
132358 132358 NM_003542 Hs.46423 H4 histone family, member G
125827 125827 NM_003403 Hs.97495 YY1 transcription factor
107609 107609 R75654 Hs.164797 hypothetical protein FLJ13693
107714 107714 AA015761 Hs.60542 ESTs
107832 107832 AA021473 gb:ze66c11.s1 Soares retina N2b4HR Homo
124337 124337 N23541 Hs.281561 Homo sapiens cDNA: FLJ23582 fis, clone L
129577 129577 N75340 HS.3Q5121 CDC20 (cell division cycle 20, S. cerevi
132000 132000 AWZ47U17 MsJba7o melanoma antigen, family A, 3
107935 107935 AAU294Z8 HS. 01030 ESTs
131461 131461 AA99Zo41 Hs.27263 KIAA1458 protein
108029 108029 AAU4Q74G HS.0ZUU7 ESTs
108084 108084 AA058944 HS.110oQZ Homo sapiens, clone IMAGE:4154008, mRNA,
108168 108168 AI453137 Hs.63176 ESTs
108189 108189 AW376061 Hs.53335 ESTs, Moderately similar to A46010 X-lin
108203 108203 AW847814 Hs.289005 Homo sapiens cDNA: FLJ21532 fis, clone C
108217 108217 AA058686 Hs.62588 ESTs
108277 108277 AA064859 gb:zm50f03.s1 Stratagene fibroblast (937
108309 108309 AA069818 gb:zm67e03n Stratagene neuroepithelium
108340 108340 AA069820 Hs.180909 peroxiredoxin 1
108427 108427 AA076382 gb:zm91g08.s1 Stratagene ovarian cancer
108439 108439 AA078986 gb:zm92h01.s1 Stratagene ovarian cancer
108469 108469 AA079487 gb:zm97©8.s1 Stratagene colon HT29 (937
108501 108501 AA083255 gb:zn08g12.s1 Stratagene hNT neuron (937
108562 108562 AA100796 gb:zm25c06^1 Stratagene pancreas (93720
130890 130890 AI907537 Hs.76598 stress-associated endoplasmic reticulum
130385 130385 AW067800 Hs.155223 stanniocalcin 2
108807 108807 AI652236 Hs.49376 hypothetical protein FLJ20644
108833 108833 AF188527 Hs.61661 ESTs, Weakly similar to AF174605 1 F-box
108846 108846 AL117452 Hs.44155 DKFZP586G1517 protein
131474 131474 L46353 Hs.2726 high-mobility group (nonhistone chromoso
108941 108941 AA148650 gb:zo09e06.s1 Stratagene neuroepithelium
108998 108996 AW995610 Hs.332436 EST
131183 131183 AI611807 Hs.285107 hypothetical protein FLJ13397
109022 109022 AA157291 Hs.21479 ubinuclein 1
109068 109068 AA164293 Hs.72545 ESTs
129021 129021 AL044675 Hs.173081 KIAA0530 protein
109146 109146 AA176589 Hs.142078 EST
131080 131080 NM_001955 Hs.2271 endothelin 1
109222 109222 AA192833 Hs.333512 similar to rat myomegalin
109481 109481 A A Q7Q001 hypothetical protein FLJ21016
109516 109516 AI471639 Hs.71913 ESTs
109556 109556 AI925294 Hs.87385 ESTs
109578 109578 F02208 Hs.27214 ESTs
109625 109625 H29490 Hs.22697 ESTs
109648 109648 H17800 Hs.7154 ESTs
109699 109699 H18013 Hs.167483 ESTs
109933 109933 R52417 Hs.20945 Homo sapiens clone 24993 mRNA sequence
110039 110039 H11938 Hs.21907 histone acetyltransferase
TABLE 2A

Table 2A shows the accession numbers for those Pkeys lacking UnigeneIDs for Table 2. The Pkeys in Table 7 lacking UnigeneIDs are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number
Accession: Genbank accession numbers



Pkey CAT Number Accession



108469 116761_1 AA079487 AA128547 AA128291 AA079587 AA079600
108501 13684.-12 AA083256
108562 35375_1 AA100796 AF020589 AA074629 AA075946 AA100849 AAO*5347 AA126309 AA079311 AA079323 AA085274
101300 4669_1 BE535511 M62098 AA306787 AW891766 AA348998 AA338869 AA344013 AW956561 AW389343 AW403607 L40391
AW408435 AA121738 AI568978 H13317 R20373 AW948724 AW948744 AA335023 AA436722 AA448690 C21404
AW884390 AA345454 AA303292 AA174174 BE092290 T90814 AA035104 R76028 AA126924 AA741086 AW022056
AW118940 AA121666 AI832409 AA683475 AI140901 AI623576 AW519064 AW474125 AI953923 AI735349 AW150109
AI436154 AW118130 AW270782 AI804073 N27434 AA876543 AA937815 AI051166 AA505378 AI041975 AI335355
AI089540 AA662243 AI127912 AI925504 AI250580 AI366874 AI554386 AI815196 AI683526 AI435885 AI160934 H79030
AI801493 AA448691 AI673767 AKJ76042 AI804327 AA813438 AA680002 AI274492 T16177 AI287337 AI935050
AA907805 AA911493 AI589411 AI371358 AW576236 AI078856 AW516168 AA346372 AI560185 AA471009 R75857
AA296025 AA523155 AA853168 AI696593 AI658482 AI566601 AW072797 AA128047 AA035502 AW243274 AA992517
R43760
117156 145392_1 W73853 AA928112 W77887 AW889237 AA148524 AI749182 AI754442 AI338392 AI253102 A10T9403 AI370541 AI697341
H97538 AW188021 AI927669 W72716 AI051402 AI188071 AI335900 N21488 AW770478 W92522 AJ691028 AI913512
AI144448 W73819 AA604358 N28900 W95221 AI868132 H98465 AA148793
125565 1704098_1 R20840 R20839
132983 11922_1 M30269 NM_002508 X82245 AI078760 AW957003 D78945 M27445 AA650439 AL04&816 AV660256 AV660347
AA333052 BE295257 T60999 AA383049 AW369677 Z26985 AW175704 AA343326 AW747957 AI818389 W17308
W17302 H15591 AA371284 AA370412 W94966 BE384365 T28498 R80714 R16959 H21723 AW835154 D56097 D56381
W21232 AA190565 AW379755 AW067895
133681 13893_1 AI352558 Z82248 X78138 NM_003405 AU077248 AA223125 S80794 D78577 AH 24697 AW403970 BE614089 BE296713
BE621334 L20422 X80536 D54224 D54950 X57345 N29226 AA127798 AA340253 F08031 AA192540 H67636 AA321827
AW950283 AA084159 BE538808 AW401377 AA256774 C03365 W46595 W47608 AA305009 H69431 H69456 AL120082
H11706 AA303717 AA361357 H22042 H78020 AW999584 AA134368 AA322911 AA322961 H60980 N85248 N31547
H79624 T11718 W85826 AW894663 AW894624 BE167441 BE170015 AA304626 AW602163 AW998929 AA156681
AA151067 BE002724 AA608688 H82692 BE155392 AW383636 BE155394 AA487004 AW383504 AI342365 R82553
W16498 BE155344 AI143938 R69901 AA322873 AW340648 R25364 AA367935 AI559406 AA033522 AA374252
AW835019 AI922133 AI697089 N99662 AW189078 AI199076 AW151598 W59944 AA662875 W94022 AA299055
AI039008 AI829449 AA583503 AB35674 AW131665 AI473820 AW273118 AW900930 AA908944 AI688035 AW170272
AI082545 AW468176 AI608761 AI082748 AI911682 AI248943 AI831016 AA192465 AI218477 AA938406 AA385288
AI809817 AA905196 AI191245 AI470204 AI188296 AI421367 AI125315 AJ087141 AA629032 AA740589 AI554181
AA150830 AI248541 AI077943 AA775958 AA864930 AC61476 AI123121 AI310394 AA862331 AA872478 BE537084
AI205606 AA720684 AI872093 AW150042 AL120538 AA219627 AA988508 C21397 AI359337 H25337 AI089749
AA605146 AI359620 AA150478 AI359738 AW383642 AW995424 AI766457 R56892 AI089839 W61343 N59107 W46459
AA565955 N2G527 AI279782 W46596 AA776573 H23204 AI866231 AI083995 N21530 AA126874 DS2530 W65437
AI086917 AW382095 AI086877 H69844 AW340217 W85827 L08439 AA262704 AA505380 W47413 W94135 AA223241
AW089153 AA084101 BE538000 AA096126 T28031 AA491574 R84813 AA774536 AW383522 AA155615 AW383529
AA491520 AW028427 AA171496 AI469689 AW664539 AI811102 AI811116 BE464590 BE350791 H78021 T15405 H21979
AA219489 H13301 AA505883 AI864305 AI423963 AW084401 F04963 R69858 H67097 AI917740 AI655561 H69864
AA033631 AW383484 AI886261 H25293 AA513281 AW271187 H11617 N79982 AI174338 AI904207 AI904208 BE614558
W94127 W65436 AI272249 AA700018 AI579932 AI085941 AW152629
121335 279548_1 AA404418 AI217248
130018 18986_1 AA353093 AW957317 AW872498 AI560785 AI289110 AW135512 X97261 T68873
121822 244391_1 AI743860 N49543 AW027759 BE349467 AI656284 BE463975 R35022 AA370031 AW955302 AL0421Q9 N53G92 AI611424
AL079362 AI969290 AI928016 BE394912 BE504220 BE467505 AI611611 AIS1 1407 AI611452 W56437 AI284566
AI583349 AW183058 AI308085 AI074952 AA437315 AA628161 AW301728 AI150224 AA400137 AA437279 AI223355
AA639462 AI261373 AW32414 AI984994 AI539335 AA401550 AA358757 AI609976 AA442357 AA359393 AA437046
AA370301 AA429328 AW272055 AI580502 AI832944 AI038530 AA425107 AI014986 AI148349 AW237721 AW779756
AW137877 AJ125293 AA400404 R28554
108309 111495_1 AA069818 AA069971 AA069923 AA069908
107832 genbank_AA021473 AA021473
123523 genbank_AA608588 AA608588
123964 genbank_C13961 C13961
118475 genbank_N66845 N66845
104787 genbank_AA027317 AA027317
106596 304084_1 AI583948 AA578212 AW303715 AA653450 AA456981 AI400385 W88533 AI224133 AW272145 AA088686 R94698
113947 genbank_W84768 W84768
108277 genbank_AA064859 AA064859
108427 genbank_AA076382 AA076382
108439 genbank_AA078986 AA078986
131353 231 SO J AW411259 H23555 AW015049 AJ684275 AW015886 AW068953 AW014085 AI027260 R52S86 AA918278 AH29462
AA969360 N34869 AI948416 AA534205 AA702483 AA705292
101447 entrez_M21305 M21305
108931 genbank_AA147186 AA147186
108941 genbank_AA148650 AA148650
103138 entrez_X65965 X65965
119174 genbank_R71234 R71234
119416 genbank_T97186 T97186
105985 genbank_AA406610 AA406610
100327 entrez_D55640 D55640
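The row layout described for Table 2A (a Pkey, a CAT number, then the Genbank accessions that make up the cluster) can be illustrated with a short sketch. This is a minimal illustration only and is not part of the disclosed method: the names `ClusterRow`, `parse_row` and `index_by_accession` are hypothetical, and the assumptions that fields are whitespace-delimited and that a CAT number is a single token (for example `111495_1`) are made here for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ClusterRow:
    """One Table 2A row: probeset key, gene cluster (CAT) number, member accessions."""
    pkey: int
    cat_number: str
    accessions: List[str]


def parse_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into Pkey, CAT number and accession list.

    Assumption: the CAT number is a single token with no embedded spaces.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected Pkey, CAT number and at least one accession: {line!r}")
    return ClusterRow(pkey=int(fields[0]), cat_number=fields[1], accessions=fields[2:])


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, List[int]]:
    """Map each Genbank accession to the Pkeys whose cluster contains it."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


# Two rows in the Table 2A layout (illustrative input).
rows = [
    parse_row("107832 genbank_AA021473 AA021473"),
    parse_row("108309 111495_1 AA069818 AA069971 AA069923 AA069908"),
]
lookup = index_by_accession(rows)
print(lookup["AA069971"])  # prints [108309]
```

A reverse index of this kind would allow any accession in the "Accession" column to be traced back to the probeset whose oligonucleotides were designed from its cluster.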
TABLE 3:






Pkey: Unique Eos probeset identifier number

Accession: Accession number used for previous patent filings

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title



Pkey Accession ExAccn UnigeneID Unigene Title



100405 D86425 AW291587 Hs.82733 nidogen 2
100420 HG1098-HT1098 D86983 Hs.118893 Melanoma associated gene
100481 HG1103-HT1103 X70377 Hs.121489 cystatin D
100484 HG3342-HT3519 NM_005402 Hs.288757 v-ral simian leukemia viral oncogene hom
100718 BE295928 Hs.75424 inhibitor of DNA binding 1, dominant neg
100991 J03764 J03836 Hs.82085 serine (or cysteine) proteinase inhibito
101097 L06797 BE245301 Hs.89414 chemokine (C-X-C motif), receptor 4 (fus
101168 L15388 NM_005308 Hs^11569 G protein-coupled receptor kinase 5
101194 L20971 L20971 Hs.188 phosphodiesterase 4B, cAMP-specific (dun
101261 L35545 D30857 Hs.82353 protein C receptor, endothelial (EPCR)
101345 L76380 NM_005795 Hs.152175 calcitonin receptor-like
101447 M21305 M21305 gb:Human alpha satellite and satellite 3
101485 M24736 AA296520 Hs.89546 selectin E (endothelial adhesion molecul
101543 M31166 M31166 Hs.2050 pentaxin-related gene, rapidly induced b
101550 M31551 Y00630 Hs.75716 serine (or cysteine) proteinase inhibito
101560 M32334 AW958272 Hs.347326 intercellular adhesion molecule 2
101674 M61916 NM_002291 Hs.82124 laminin, beta 1
101714 M68874 M68874 Hs.211587 phospholipase A2, group IVA (cytosolic,
101741 M74719 NM_003199 Hs.326198 transcription factor 4
101838 M92934 BE243845 Hs.75511 connective tissue growth factor
101857 M94856 BE550723 Hs.153179 fatty acid binding protein 5 (psoriasis-
102012 U03057 BE259035 Hs.118400 singed (Drosophila)-like (sea urchin fas
102024 U03877 AA301867 Hs.76224 EGF-containing fibulin-like extracellula
102164 U18300 NM_000107 Hs.77602 damage-specific DNA binding protein 2 (4
102241 U27109 NM_007351 Hs.268107 multimerin
102283 U31384 AW161552 Hs.83381 guanine nucleotide binding protein 11
102303 U33053 U33053 Hs.2499 protein kinase C-like 1
102564 U59423 U59423 Hs.79067 MAD (mothers against decapentaplegic, Dr
102663 U70322 NM_002270 Hs.168075 karyopherin (importin) beta 2
102759 U81607 NM_005100 Hs.788 A kinase (PRKA) anchor protein (gravin)
102778 U83463 AF000652 Hs.8180 syndecan binding protein (syntenin)
102804 U89942 NM_002318 Hs.83354 lysyl oxidase-like 2
102887 X04729 J03836 Hs.82085 serine (or cysteine) proteinase inhibito
102898 X06256 NM_002205 Hs.149609 integrin, alpha 5 (fibronectin receptor,
102915 X07820 X07820 Hs.2258 matrix metalloproteinase 10 (stromelysin
103036 X54925 M13509 Hs.83169 matrix metalloproteinase 1 (interstitial
103037 X54936 BE018302 Hs.2894 placental growth factor, vascular endoth
103095 X60957 NM_005424 Hs.78824 tyrosine kinase with immunoglobulin and
103158 X67235 BE242587 Hs.118651 hematopoietically expressed homeobox
103166 X67951 AA159248 Hs.180909 peroxiredoxin 1
103185 X69910 NM_006825 Hs.74368 transmembrane protein (63kD), endoplasmi
103280 X79981 U84722 Hs.76206 cadherin 5, type 2, VE-cadherin (vascula
103554 Z18951 AI878826 Hs.74034 caveolin 1, caveolae protein, 22kD
103850 AA187101 AA187101 Hs.213194 hypothetical protein MGC10895
104465 N24990 Z44203 Hs.36418 ESTs
104592 R81003 AW630488 Hs.25338 protease, serine, 23
104764 AA025351 AI039243 Hs.278585 ESTs
104786 AA027168 AA027167 Hs.10031 KIAA0955 protein
104850 AA040465 AL133035 Hs.8728 hypothetical protein DKFZp434G171
104865 AA045136 T79340 Hs.22575 B-cell CLL/lymphoma 6, member B (zinc fi
104894 AA054087 AF065214 Hs.18858 phospholipase A2, group IVC (cytosolic,
104952 AA071089 AW076098 Hs.345588 desmoplakin (DPI, DPII)
104974 AA085918 Y12059 Hs.278575 bromodomain-containing 4
105178 AA187490 AA313825 Hs.21941 AD036 protein
105253 AA227926 AW388633 Hs.6682 solute carrier family 7, (cationic amino
105330 AA234743 AW338625 Hs.22120 ESTs
105376 AA236559 AW994032 Hs.8768 hypothetical protein FLJ10849
105729 AA292694 H46612 Hs.293815 Homo sapiens HSPC2S5 mRNA, partial cds
105626 AA398243 AA478756 Hs.194477 E3 ubiquitin ligase SMURF2
105977 AA406363 AK001972 Hs.30822 hypothetical protein FLJ11110
106008 AA411465 AB033888 Hs.8619 SRY (sex determining region Y)-box 18
106031 AA412284 X64116 Hs.171844 Homo sapiens cDNA: FLJ22296 fis, clone H
106124 AA423987 H93366 Hs.7567 Homo sapiens cDNA: FLJ21962 fis, clone H
106155


AA425309 


AA425414 


Hsi3287 


106302 


AA435896 


AA398859 


Hs.18397 


106423 


AA448238 


AB020722 


Hs.16714 


106793 


AA478778 


H94997 


Hs.16450 


107174 


AA621714 


BE122762 


Hs.25338 


107216 


D51069 


D51069 


Hs^11579 


107295 


T34527 


AA186629 


Hs.60120 


107385 




NM-005397HS.16426 


1(8756 


AA1 27221 


AA127221 


Hs.1 17037 


108846 


AA139QR3 


AL1 17452 


Hs.44155 


108688 


MM IOOOUQ 


AA135606 


Ks.189384 


109001 


AAlSfilTS 

MA I3D 1 l-*J 


AI056548 


Hs.72116 


109166 


AA17Qftd5 

MM 1/ 9QHO 


AA219691 


Hs.73625 






AW956580 Hs,42699 


109768 


F 10399 

r IU099 


F06838 


Hs.14763 


110107 


H1S772 


AW151660 Hs.31444 


110906 


N39584 


AA035211 


Hs.17404 


110984 


N5200G 


AW613287 Hs.80120 


111006 


N53375 


BE387014 


Hs.166146 


111018 


N 54067 


AI287912 


Hs.3628 


111133 


N64436 


AW580939 Hs.97199 


111760 


R26892 


BE551929 


Hs.268754 


113073 


T33R37 

1 3003' 


N39342 


Hs.103042 


1131QA 
MO 1 93 


■PT7119 
I DM \C 


H83265 


Hs.8881 




vvou/oo 


AW953484 Hs.3849 




MMlWOotfO 


AW139036 Hs.108957 


1 I3U0I 


AA953917 

MMIOOZ 1/ 


AI751438 


Hs.41271 


1 IJvoO 


MMZ3399 1 


AI683069 


Hs.175319 




AA95R13R 

MMZ30 IOO 


AA740907 


Hs.88297 


1 13019 


AAA9fi573 
MMH£03/3 


AA486620 


Hs.41135 


115Q47 

1 139*// 


AA4437Q3 


R47479 


Hs.94761 


11S314 

1 103 IH 


MMH3U300 


AI799104 


Hs.178705 


116339 


AA496257 


AK000290 


Hs.44033 


11&&3fl 


AARM717 

MMOU3/ I / 


AK001531 


Hs.66048 


■ I03O9 


M393/U 


AI557212 


Hs.17132 


11R733 
1 10/ oo 


F 13787 
r 13/ o/ 


AL157424 


Hs.61289 


1 M\)£3 


noo 1 3/ 


AW070211 Hs.102415 


111 100 




H98988 


Hs.42612 


III 000 


M349A7 


AF055634 


Hs.44553 


11/83/ 




N52090 


Hs.47420 


110n/0 


nOOO**0 


N66845 




11000 1 


IN009U3 


N68905 




nomn 

1 13U/ J 




BE245360 


Hs^79477 


1 19 133 


R61715 

r>w 1/13 


R61715 


Hs.310598 


11Q174 

1 19 l/H 


R71234 


R71234 






R98105 


C14322 


Hs^50700 


11Q416 

1 I9*r IO 


T97186 


T97186 




1 19000 


W60814 


AA496205 


Hs.193700 


191335 


AA404418 


AA404418 




1913R1 


AA4Q5747 

MM*tU3/*f f 


AW088642 Hs.97984 


izo iou 


AA4RARR7 


AA488687 Hs.284235 


1 Ti4T3 


MM399 InO 


AA599143 




1O0C9^ 

l&JOZJ 


MMDUO3O0 


AA6085B8 




l/OOOO 


AAfiftR7«\1 

MMDwO/3 • 


AA608751 




100QC4 
li03M 


P13QR1 

\s 1390 1 


C13961 




io/nnR 


Dfin3fl2 

L/UU3VC 


AI147155 


Hs.270016 


1 Z431 3 


UOAQQO 
rl9*t09£ 


NM_0054O2Hs288757 


l£H009 


P(903£ I 


AI680737 


Hs.289068 


l£*)009 


N95477 


A1571594 


Hs.102943 




R60044 


W07701 


Hs304177 




R70506 


AK87664 


Hs^85814 




T91518 


T91518 




1M1A3 

lb) IW 


T95333 


AA570056 Hs. 122730 


icoooo 


R45630 


R60547 


Hs.170098 


19CCCC 
I £3303 


R20839 


R20840 




IZ339U 


.R23858 


R23858 


Hs.143375 


19CC11 
1^0011 




T92143 


Hs.57958 


126563 


VV26247 


AA516391 


Hs.181368 


126649 


AA856990 


AA001880 


Hs.279531 


126672 


AA136653 


AW450979 




127402 


AA358869 


AA358869 


HSJ227949 


127651 


AJ123976 


AA382523 


Hs.105689 


127759 


A869384 


A1369384 


Hs.292441 


128062 


AA379500 


AA379621 


Hs.105547 


128992 


R49693 


H04150 


Ks.107708 


129046 


AA195678 


A6029290 Hs. 108258 



nuclear factor l/B 

hypoMcal protein FU 23221 

Rho guanine exchange factor (GEF) 15 

ESTs 

ESTs 

melanoma cefl adhesion moleculB 

UDP^^tyl^^[>^alactosaniine:poryp 

pcxJcicaiyxin-Gke 

ESTs 

DKFZP586G1517 protein 

gb2t10a05^1 SoaresjiregnartLutenis_NbH 

hypothetical protein FU20992 similar to 

RAB6 interacting, IdnesMce (rabktnes 

ESTs 

ESTs 

ESTs 

ESTs 

UDP-N-acetyl-alpha-D-galactosamine:polyp
Homer, neuronal immediate early gene, 3
mitogen-activated protein kinase kinase
complement component C1q receptor
Homo sapiens cDNA FLJ11949 fis, clone HE
microtubule-associated protein 1B
ESTs, Weakly similar to S41044 chromosom
hypothetical protein FLJ22041 similar to
40S ribosomal protein S27 isoform
Homo sapiens mRNA full length insert cDN
ESTs 
ESTs 

endomucin-2
KIAA1691 protein

Homo sapiens cDNA FLJ11333 fis, clone PL



hypothetical protein FLJ10669

ESTs, Moderately similar to I54374 gene

synaptojanin 2 

Homo sapiens mRNA; cDNA DKFZp586N0121 (f
ESTs, Weakly similar to ALU1_HUMAN ALU S
unc5 (C.elegans homolog) c
EST 

gb:za46tf 1 ^1 Soares fetal liver spleen
gb:za69b09.s1 Soares_fetal_lung_NbHL19W
ESTs 

ESTs, Moderately similar to ALU1_HUMAN A
gb:yi54c08.s1 Soares placenta Nb2HP Homo
tryptase beta 1

gb:ye50h09.s1 Soares fetal liver spleen
Homo sapiens mRNA; cDNA DKFZp586I0324 (f
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_
hypothetical protein FLJ22252 similar to
ESTs, Weakly similar to I38022 hypotheti
gb:ae52d04.s1 Stratagene lung carcinoma
gb:ae54e06.s1 Stratagene lung carcinoma
gb:ae56h07.s1 Stratagene lung carcinoma
gb:C13961 Clontech human aorta polyA+ mR
ESTs 

v-ral simian leukemia viral oncogene hom

Homo sapiens cDNA FLJ11918 fis, clone HE

hypothetical protein MGC12916

Homo sapiens clone FLB8503 PRO2286 mRNA,

sprouty (Drosophila) homolog 4

gb7eMJ5^1 Stratagene lung (937210) H 

ESTs, Moderately similar to KIAA1215 pro

KIAA0372 gene product

gb:yg05c08j1 Soares infant brain 1NIB H

Homo sapiens, clone IMAGE:3840937, mRNA,

EGF-TM7-latrophilin-related protein

U5 snRNP-specific protein (220 kD), orth

ESTs 

gb^W«l3-ala-a-12-(Mil^1 NCI_CGAP.Su 
SEC13 (S. cerevisiae)-like 1
MSTP031 protein 
ESTs 

neural proliferation, differentiation an
ESTs

actin binding protein; macrophin (microf
129188 M30257
13314 AA028131 
129371 M10321 
129468 J03040 
129765 M86933 
129805 AA012933 
129884 AA286710 
130495 AA243278 
130639 D59711
130657 T94452 
130828 AA053400 
130972 AA370302 
131080 J05003 
131137 U85193 
131182 AA256153 
131486 X83107 
131573 AA046593 
131647 AA410480 
131756 D45304 
131859 M90657 
131881 AA010163 
132050 AA136353 
132083 Y078S7 
132164 U84573 
132358 X60485 
132413 AA132969 
132456 AA114250
132490 F13782 
132676 AA283035 
132687 AB002301 
132718 AA056731 
132736 U68019 
132760 H99198 
132933 AA598702 
132968 N77151 
132994 AA505133 
133061 AB000584 
133147 D12763 
133161 AA253193 
133200 AA432248 
133260 AA083572 
133363 AA479713 
133491 L40395 
133517 X52947 
133550 W80846 
133607 M34539 
133614 D67029
133627 U09587 
133691 M85289 
133696 D10522 
133913 W84712 
133975 D29992 
133985 L34657 
134039 S78569
134088 D43636
134161 U97188 
13439 AA487558 
134416 M28882 
134453 X70583 
134656 X14787 
134989 AA236324 
135051 C15324 
135073 AA452000 
135349 D83174
100114 D00596
100130 D11428
100143 D13640
100168 D14874
100208 D26129
100224 D28476
100405 D86425
100420 D86983
100455 D87953 
10053 HG1862-HT1897 

100618 HG2614-HT2710 

100619 HG2639-HT2735 



NM_001078 Hs.109225
BE622768 Hs.290355 
X06828 Hs.110802 
AW410538 Hs.111779 
M86933 Hs.1238 
AA012848 Hs.12570
AF055581 Hs.13131 
AW250380 Hs.109059 
AI557212 Hs.17132 
AW337575 Hs.201591 
AW631469 Hs.203213 
D81866 Hs.21739 
NM_001955 Hs.2271
W27392 Hs.33287 
AI824144 Hs.23912 
F06972 Hs.27372 
AA040311 Hs.28959 
AA359615 Hs.30089 
AA443966 Hs.31595 
AW960564 
AW361018 Hs.3383 
AI267615 Hs.38022
BE386490 Hs.279663
AI752235 Hs.41270
NM_003542 Hs.46423
AW361383 Hs.260116 
AB011084 Hs.48924 
NM_001290 Hs.4980
N92589 Hs.261038 
AB002301 Hs.54985 
NM_004600 Hs.554
AW081883 Hs.211578 
AA125985 Hs.56145 
BE263252 Hs.6101 
AF234532 Hs.61638 
AA112748 Hs.279905
AH86431 Hs.296638
AA026533 Hs.66 
AW021103 Hs.6631 
AB037715 Hs.183639 
AA403045 Hs.6906 
AI866286 Hs.71962 
BE619053 Hs.170001 
NM_000165 Hs.74471
AJ129903 Hs.74669 
BE273749 

NM_003003 Hs.75232
NM_002047 Hs.75280
M85289 Hs.211573 
AI878921 Hs.75607
AU076964 Hs.7753 
C18356 Hs.295944
L34657 Hs.78146 
NM_002290 Hs.78672
AI379954 Hs.79025 
AA634543 Hs.79440 
AW580939 Hs.97199 
X68264 Hs.211579
AI272141 Hs.83484
AI750878 Hs.87409
AW968058 Hs.92381
AI272141 Hs.83484
W55956 Hs.94030 
AA114212 Hs.9930
X02308 Hs.82962 
NM_000304 Hs.103724
AU076465 Hs.278441 
H73444 Hs.394 
NM_002933 Hs.78224
AL121516 Hs.138617 
AW291587 Hs.82733 
D86983 Hs.118893
AW888941 Hs.75789 
BE313693 Hs.234330
AI752163 Hs.114599 
N24433 Hs.241567



vascular cell adhesion molecule 1 
mesoderm development candidate 1 
von Willebrand factor
secreted protein, acidic, cysteine-rich
amelogenin (Y chromosome)
tubulin-specific chaperone d



mitochondrial ribosomal protein L12
ESTs, Moderately similar to I54374 gene 
ESTs 
ESTs 

Homo sapiens mRNA; cDNA DKFZp586H518 (f 
endothelin 1
nuclear factor I/B
ESTs 

BMX non-receptor tyrosine kinase 

ESTs 

ESTs 

ESTs 

transmembrane 4 superfamily member 1
upstream regulatory element binding prot 
ESTs 
Pirin 

procollagen-lysine, 2-oxoglutarate 5-dio 
H4 histone family, member G 
metalloprotease 1 (pitrilysin family)
KIAA0512 gene product ALEX2
LIM domain binding 2
ESTs, Weakly similar to I38022 hypotheti
KIAA0303 protein

Sjogren syndrome antigen A2 (60kD, ribon 
Homo sapiens cDNA: FLJ23037 fis, clone L
thymosin, beta, identified in neuroblast
hypothetical protein MGC3178
myosin X 

clone HQ0310 PRO0310p1
prostate differentiation factor 
interleukin 1 receptor-like 1 
hypothetical protein FLJ20373
hypothetical protein FLJ10210
Homo sapiens cDNA: FLJ23197 fis, clone R
ESTs, Weakly similar to B36298 proline-r 
eukaryotic translation initiation factor 
gap junction protein, alpha 1, 43kD (con
vesicle-associated membrane protein 5 (m
FK506-binding protein 1A (12kD)
SEC14 (S. cerevisiae)-like 1
glycyl-tRNA synthetase
heparan sulfate proteoglycan 2 (perlecan
myristoylated alanine-rich protein kinas
calumenin 

tissue factor pathway inhibitor 2 
platelet/endothelial cell adhesion molec
laminin, alpha 4
KIAA0096 protein
IGF-II mRNA-binding protein 3
complement component C1q receptor
melanoma cell adhesion molecule
SRY (sex determining region Y)-box 4
thrombospondin 1

nudix (nucleoside diphosphate linked moi
SRY (sex determining region Y)-box 4
Homo sapiens mRNA; cDNA DKFZp586E1624 (f 
serine (or cysteine) proteinase inhibito 



peripheral myelin protein 22 
KIAA0015 gene product
adrenomedullin

ribonuclease, RNase A family, 1 (pancrea
thyroid hormone receptor interactor 12
nidogen 2

Melanoma associated gene 

N-myc downstream regulated 

calmodulin 2 (phosphorylase kinase, delt

collagen, type VIII, alpha 1

RNA binding motif, single stranded inter
100658 HG2855-HT2995
100576 HG3044-HT3742
100718 HG3342-HT3519 
100752 HG3543-HT3739 
100828 HG4069-HT4339
100850 HG417-HT417 
100991 J03764 
101097 L06797 
101110 L08246 

101142 L12711
101156 L13977 
101168 L15388
101184 L19871 
101192 L20859 

101317 L42176
101335 L49169 
101345 L76380 
101400 M15990 
101475 M23254 

101485 M24736
101496 M26576 
101505 M27396 
101543 M31166 
101557 M31994 

1015S0 M32334
101587 M35078 
101592 M36429 

101633 M57730 

101634 M57731 
101667 M60858

101682 M62994 
101714 M68874 
101720 M69043 
101741 M74719 
101744 M75126
101793 M84349 

101837 M92843 

101838 M92934 
101840 M93056 

101857 M94856

101864 M95787 

101931 S76965 

101966 S81914

102012 U03057 
102013 U03100

102024 U03877 

102059 U08021 

102121 U14391 

102283 U31384 
102300 U32944

102378 U40369 

102395 U41767 

102460 U48959 

102491 U51010 
102499 U51478

102523 U53445 

102560 U59289 

102564 U59423 

102589 U62015 
102600 U63825

102645 U67963 

102687 U73379 

102693 U73824 

102709 U77604 
102759 U81607

102804 U89942 

102882 X04412 

102907 X05985 

102915 X07820 
102927 X12876

102960 X15729 

103011 X52541 

103020 X53416 

103029 X54489 
103036 X54925

103056 X57205 



U56725 Hs.180414 

X02761 Hs.287820 

BE295928 Hs.75424 
T81^)9 

AL048753 Hs.303649 

AA836472 Hs.297939 



J03835 
BE245301 
AI439011 
L12711 



Hs.82085 
Hs.89414 
Hs.86386 
Hs.89643 



AA340987 Hs.75693 
NM_005308 Hs.211569
NM_001674 Hs.460
BE247295 Hs.78452 
L42176 Hs.8302 
NM_006732 Hs.75678
NM_005795 Hs.152175
M15990 Hs.194148
BE410405 Hs.76288 
AA296520 Hs.89546 
X12784 Hs.119129 
AA307680 Hs.75692 
M31166 Hs.2050 
BE293116 Hs.76392 
AW958272 Hs.347326 
AI752416 Hs.77326 
AF064853 Hs.91299 
NM_004428 Hs.1624
AV650262 Hs.75765 
NM_005381
AF043045 Hs.81008 
M68874 Hs.211587
M69043 Hs.81328 
NM_003199 Hs.326198
AI879352 Hs.118625



W01076 
M92843 



Hs.278573 
Hs.343586 



BE243845 Hs.75511 
AA236291 Hs.183583 
BE550723 Hs.153179 
BE392588 Hs.75777 
NM_006823 Hs.75209
X96438 Hs.76095 
BE259035 Hs.118400
BE616287 Hs.178452 
AA301867 Hs.76224
AI752666 Hs.76669 
NM_004998 Hs.82251
AW161552 Hs.83381 
AI929721 Hs.5120 
AU076887 Hs.28491
AU077005 Hs.92208 
U48959 Hs.211582 
U51010 

BE243877 Hs.76941
U53445 
R97457 
U59423 



Hs.15432 
Hs.63984 
Hs.79067 



AU076728 Hs.8867 
AI984144 Hs.66713 
AL119566 Hs.6721
NM_007019 Hs.93002
AA532780 Hs.183684 
AA122237 Hs.81874 
NM_005100 Hs.788
NM_002318 Hs.83354
AI767736 Hs.290070 
BE409861 Hs.202833
X07820 Hs.2258
BE512730 Hs.65114 
AI904738 Hs.76053 
AJ243425 Hs.326035 
X53416 Hs.195464 
AW800726 Hs.789 
M13509 Hs.83169 
Y18024 Hs.76877 



heat shock 70kD protein 2 
fibronectin 1

inhibitor of DNA binding 1, dominant neg
insulin-like growth factor 2 (somatomedi
small inducible cytokine A2 (monocyte ch 
cathepsin B 

serine (or cysteine) proteinase inhibito
chemokine (C-X-C motif), receptor 4 (fus
myeloid cell leukemia sequence 1 (BCL2-r
transketolase (Wernicke-Korsakoff syndro



G protein-coupled receptor kinase 5 
activating transcription factor 3 
solute carrier family 20 (phosphate tran
four and a half LIM domains 2
FBJ murine osteosarcoma viral oncogene h
calcitonin receptor-like
v-yes-1 Yamaguchi sarcoma viral oncogene
calpain 2, (m/II) large subunit
selectin E (endothelial adhesion molecul
collagen, type IV, alpha 1 



pentaxin-related gene, rapidly induced b
aldehyde dehydrogenase 1 family, member
intercellular adhesion molecule 2
insulin-like growth factor binding prote
guanine nucleotide binding protein (G pr
ephrin-A1
GRO2 oncogene
nucleolin

filamin B, beta (actin-binding protein-2
phospholipase A2, group IVA (cytosolic,
nuclear factor of kappa light polypeptid
transcription factor 4 
hexokinase 1 

CD59 antigen p18-20 (antigen identified 
zinc finger protein homologous to Zfp-36 
connective tissue growth factor 
serine (or cysteine) proteinase inhibito 
fatty acid binding protein 5 (psoriasis-



immediate early response 3 
singed (Drosophila)-like (sea urchin fas
catenin (cadherin-associated protein), a
EGF-containing fibulin-like extracellula
nicotinamide N-methyltransferase
myosin IE

guanine nucleotide binding protein 11 
dynein, cytoplasmic, light polypeptide
spermidine/spermine N1-acetyltransferase
a disintegrin and metalloproteinase doma
myosin, light polypeptide kinase
gb:Human nicotinamide N-methyltransferas
ATPase, Na+/K+ transporting, beta 3 poly
downregulated in ovarian cancer 1
cadherin 13, H-cadherin (heart)
MAD (mothers against decapentaplegic, Dr
cysteine-rich, angiogenic inducer, 61 
hepatitis delta antigen-interacting prot 
lysosomal 

ubiquitin carrier protein E2-C
eukaryotic translation initiation factor
microsomal glutathione S-transferase 2
A kinase (PRKA) anchor protein (gravin)
lysyl oxidase-like 2
gelsolin (amyloidosis, Finnish type)
heme oxygenase (decycling) 1
matrix metalloproteinase 10 (stromelysin
keratin 18 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
early growth response 1
filamin A, alpha (actin-binding protein-
GRO1 oncogene (melanoma growth stimulati
matrix metalloproteinase 1 (interstitial
inositol 1,4,5-trisphosphate 3-kinase B
103080 X59798
103095 X60957 
103138 X55955 
103176 X69111 
103195 X70940 
103347 X87838 
103371 X91247 
103432 X97748 
103471 Y00815 
103967 AA303711 
104447 L44538 
104764 AA025351 
104783 AA027050 
104798 AA029462 
104865 AA045136 
104877 AA047437 
104894 AA054087 
104952 AA071089 
105113 AA156450 
105178 AA187490 
105196 AA195031 
105215 AA205724 
105263 AA227926 
105271 AA227986 
105330 AA234743 
105461 AA253216 

105492 AA256210 

105493 AA256268 
105594 AA279397 
105727 AA292379 
105732 AA292717 
105767 AA346551 
105882 AA400292 
105936 AA404338 
106031 AA412284 
106124 AA423987 
106222 AA428594 
106241 AA430108 

106263 AA431462 

106264 AA431470 
106366 AA443756 
106454 AA449479 
106634 AA459916 
106724 AA465226 
106793 AA478778 
106799 AA479037 
106842 AA482597 
106868 AA487561 
106890 AA489245 
106961 AA504110 
106974 AA520989 
107030 AA599434 
107061 AA603649 
107066 AA609519 
107216 D51069 
107385 U97519 
107444 W28391 
107985 AA035638 
108507 AA083514 
108695 AA121315 
108931 AA147186 
109001 AA156125 
109195 AA188932 
109390 AA219653 
109456 AA232645 
109737 F10078 
110411 H48032 
110660 H82117 
110906 N39584 
111018 N54067
111091 N59858 
111356 N90933 
111378 N93764 
111741 R26124 
111769 R27957 
112318 R55470 



AU077231 Hs.82932 
NM_005424 Hs.78824
X65965 

AL021154 Hs.76884 
AA351647 Hs.2642
AU077309 Hs.171271 
X91247 Hs.13046 
X97748 

Y00815 Hs.75216 
AL120051 Hs.144700 
AW204145 Hs.156044 
AI039243 Hs.278585
AA533513 Hs.93659 
AW952619 Hs.17235 
T79340 Hs.22575 
AI138635 Hs.22968 
AF065214 Hs.18858 
AW076098 Hs.345588 
AB037816 Hs.8982 
AA313825 Hs.21941 
W84893 Hs.9305 
AA205759 Hs.10119
AW388633 Hs.6682 
AA807881 Hs.25329 
AW338625 Hs.22120 
BE539071 Hs.69388 
AI805717 Hs.289112 
AL047586 Hs.10283 
AB024334 Hs.25001 
AL135159 Hs.20340
AW504170 Hs.274344 
AW370946 Hs.23457 
W45802 Hs.81988 
AI678765 Hs.21812
X64116 Hs.171844 
H93366 Hs.7567 
AA356392 Hs.21321 
BE019681 Hs.6019 
W21493 Hs.28329 
AL046859 Hs^407 
AA186715 Hs.336429 
NM_014038 Hs.5216
W25491 Hs.288909 
N48670 Hs.28631 
H94997 Hs.16450 
BE313412 HsJ961 
AF124251 Hs.26054 
BE185536 Hs.301183
AA489245 Hs£8500 
AW243614 Hs.18063 
AI817130 Hs.9195
AL117424 Hs.25035
BE147611 Hs.6354 
NM_012331 Hs.26458
D51069 Hs.211579
NM_005397 Hs.16426
W28391 Hs.343258 
T40054 Hs.71968 
AI554545 Hs.68301
AB029000 Hs.70623 
AA147186 

AI056548 Hs.72116 
AF047033 Hs.132904
AW007485 Hs.87125 
AW956580 Hs.42699 
AA055415 Hs.13233 
AW001579 Hs.9645 
AA782114 Hs.28043
AA035211 Hs.17404 
AI287912 Hs.3628 
AA300067 Hs.33032 
BE301871 Hs.4857 
AW160993 Hs.326292
ABQ20653 Hs£4024 
AW629414 Hs.24230
AW083384 Hs.11067 



cyclin D1 (PRAD1: parathyroid adenomatos

tyrosine kinase with immunoglobulin and

gb:H.sapiens SOD-2 gene for manganese su

inhibitor of DNA binding 3, dominant neg

eukaryotic translation elongation factor

catenin (cadherin-associated protein), b

thioredoxin reductase 1

gb:H.sapiens PTX3 gene promoter region.

protein tyrosine phosphatase, receptor t 

ephrin-B1 

ESTs 

ESTs 

protein disulfide isomerase related prot

Homo sapiens clone TCCCIA00176 mRNA sequ

B-cell CLL/lymphoma 6, member B (zinc fi

Homo sapiens clone IMAGE:451939, mRNA se

phospholipase A2, group IVC (cytosolic,

desmoplakin (DPI, DPII)

Homo sapiens, clone IMAGE:35G6202, mRNA,

AD036 protein 

angiotensin receptor-like 1

hypothetical protein FLJ14957

solute carrier family 7, (cationic amino

ESTs 

ESTs 

hypothetical protein FLJ20505

CGW3 protein 

RNA binding motif protein 8B 

tyrosine 3-monooxygenase/tryptophan 5-mo

KIAA1002 protein

hypothetical protein MGC12942

ESTs 

disabled (Drosophila) homolog 2 (mitogen
ESTs 

Homo sapiens cDNA: FLJ22296 fis, clone H

Homo sapiens cDNA: FLJ21962 fis, clone H

Homo sapiens clone FLB9213 PRO2474 mRNA,

Homo sapiens cDNA: FLJ21288 fis, clone C

hypothetical protein FLJ14005

protein kinase (cAMP-dependent, catalyti

RIKEN cDNA 9130422N19 gene 

HSPC028 protein 

hypothetical protein FLJ22471

Homo sapiens cDNA: FLJ22141 fis, clone H

ESTs 

Homo sapiens clone 25012 mRNA sequence 
novel SH2-containing protein 3
molecule possessing ankyrin repeats indu
mitogen-activated protein kinase 8 inter
Homo sapiens cDNA FLJ10768 fis, clone NT
Homo sapiens cDNA FLJ13698 fis, clone PL
chloride intracellular channel 4 
stromal cell derived factor receptor 1 
methionine sulfoxide reductase A 
melanoma cell adhesion molecule 
podocalyxin-like

proliferation-associated 2G4, 38kD

Homo sapiens mRNA; cDNA DKFZp564F053 (fr 

ESTs 

KIAA1077 protein

gb:zo38d01.s1 Stratagene endothelial cel
hypothetical protein FLJ20992 similar to
solute carrier family 4, sodium bicarbon
EH-domain containing 3 
ESTs 

ESTs, Moderately similar to A47582 B-cel 
Homo sapiens mRNA for KIAA1741 protein,
ESTs 
ESTs 

mitogen-activated protein kinase kinase
hypothetical protein DKFZp434N185
mannosyl (alpha-1,6-)-glycoprotein beta-
hypothetical gene DKFZp434A1114
KIAA0846 protein
ESTs 

ESTs, Highly similar to T46395 hypotheti
112951 T16550
113057 T26674 
113195 T57112 
113490 T88700 
113542 T90527 
113803 W42789 
113847 W60002 
113910 W78175 
113947 W84768 
114047 W94427 
115061 AA253217 
115819 AA426573
115870 AA432374 
115984 AA446622 
116228 AA478771 
116264 AA482594 
116314 AA490588 
116589 D59570 
117023 H88157 
117112 H94648 
117156 H97538 
117176 H98670 
117280 N22107 
119559 W38197 
119866 W80814 
120655 AA287347 
121314 AA402799
121335 AA404418
121822 AA425107 
121835 AA425435 
122331 AA442872
122577 AA452860 
123160 AM886S7 
123486 AA599674 
124059 F13673 
124339 H99093 
124358 N22495 
124364 N23031 
124726 R15740 
124763 R39610 
125167 W45560 
125304 Z39833 
125307 Z40583 
1253a AA825437 
125598 R66613 
125609 AA868063 
418245 AA128075 
127435 N66570 
127566 AI051390 
127619 AA627122 
128453 X02761 
128495 AF010193
128515 AA149044 
128560 U82108 
128623 D78676
128642 L35240 
128669 AA598737 
128903 R69417 
128914 AA232837 
129087 N72695 
129188 M30257 
1292% M96843 
129265 X68277 
129345 AA292440 
129468 J03040
129488 AA228107 
129498 AA449789
129557 W01367 
129619 AA610116 
129627 AA258308 
129762 AA460273 
129884 AA286710 
130018 T68873 
130147 D63476 
130178 M52403 
130282 X55740 



AA307634 Hs.6650 
AW194301 Hs.339283 
H83265 Hs.8881 
BE178110 Hs.173374 
H43374 Hs.7890 
AW880709 Hs.283683 
NM_005032 Hs.4114
AA113262 Hs.17901 
W84768 

AL035858 Hs.3807 
AI751438 Hs.41271 
AA486620 Hs.41135 
NM_005985 Hs.48029
AA987569 Hs.74313 
AI767947 Hs.50841
D51174 Hs.272239 
AI799104 Hs.178705
AI557212 Hs.17132 
AW070211 Hs.102415 
AW969999 Hs.293658 
W73853 



Hs.49753 
Hs.172129 



H45100 
M18217 
W38197 

AA496205 Hs.193700 
AA305599 Hs.238205 
W07343 Hs.182538 
AA404418 
AI743860 

AB033030 Hs.300870 
AL133437 Hs.110771
AA829725 Hs.334437 
AA488687 Hs.284235 
BE019072 Hs.334802 
BE387335 Hs.283713 
H99093 Hs.343411 
AW070211 Hs.102415 
AF265555 Hs.250646 
NM_003654 Hs.104576
BE410405 Hs.76288 
AL137540 Hs.102541 
AL359573 Hs.124940 
AW580945 Hs.330466 
AA825437 Hs.58875 
T40064 Hs.71968 
AA868063 Hs.104576 
AA088767 Hs.83883 
X69086 Hs.286161 
AI051390 Hs.116731
AA627122 Hs.163787 
X02761 Hs.287820 
NM_005904 Hs.100602
BE395085 Hs.10086 
U82108 Hs.101813 
BE076608 Hs.105509
Z28913 Hs.102948 
W28493 Hs.180414 
AW150717 Hs.345728 
AW867491 Hs.107125 
AI348027 Hs.108557
NM_001078 Hs.109225
BE222494 Hs.180919 
AA530892 Hs.171695 
R22497 Hs.110571 
AW410538 Hs.111779
AW966728 Hs.54642 
AA449789 HsJ5511 
AL045404 Hs.46366 
AA209534 Hs.284243 
T40064 Hs.71968 
AA453694 Hs.12372 
AF055581 Hs.13131 
AA353093 

D63476 Hs.172813 
U20982 Hs.1516 
BE245380 Hs.153952



vacuolar protein sorting 45B (yeast homo 
Human DNA sequence from clone RP1-187J11
ESTs, Weakly similar to S41044 chromosom
Homo sapiens cDNA FLJ10500 fis, clone NT
Homo sapiens mRNA for KIAA1671 protein,
chromosome 8 open reading frame 4
plastin 3 (T isoform)

Homo sapiens, clone IMAGE:3937015, mRNA,
gb:zh53dfJ3.s1 Soares_fetal_liver_spleen_
FXYD domain-containing ion transport reg
Homo sapiens mRNA full length insert cDN
endomucin-2

snail 1 (drosophila homolog), zinc finge

KIAA1285 protein

ESTs

lysosomal 

Homo sapiens cDNA FLJ11333 fis, clone PL
ESTs, Moderately similar to I54374 gene
Homo sapiens mRNA; cDNA DKFZp586N0121 (f 
ESTs 
ESTs 

uveal autoantigen with coiled coil domai
Homo sapiens cDNA: FLJ21409 fis, clone C
Empirically selected from AFFX single pr
Homo sapiens mRNA; cDNA DKFZp586I0324 (f
hypothetical protein PRO2013
phospholipid scramblase 4
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_
metallothionein 1E (functional)
KIAA1204 protein

Homo sapiens cDNA: FLJ21904 fis, clone H
hypothetical protein MGC4248
ESTs, Weakly similar to I38022 hypotheti
Homo sapiens cDNA FLJ14680 fis, clone NT
ESTs, Weakly similar to S64054 hypotheti
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
Homo sapiens mRNA; cDNA DKFZp586N0121 (f
baculoviral IAP repeat-containing 6
carbohydrate (keratan sulfate Gal-6) sul
calpain 2, (m/II) large subunit
netrin 4

GTP-binding protein

ESTs 

ESTs 

Homo sapiens mRNA; cDNA DKFZp564F053 (fr
carbohydrate (keratan sulfate Gal-6) sul
transmembrane, prostate androgen induced
Homo sapiens cDNA FLJ13613 fis, clone PL
ESTs 
ESTs 

fibronectin 1 

MAD (mothers against decapentaplegic, Dr 
type I transmembrane protein Fn14 
solute carrier family 9 (sodium/hydrogen
CTL2 gene

enigma (LIM domain protein) 
heat shock 70kD protein 8 
STAT induced STAT inhibitor 3
plasmalemma vesicle associated protein
hypothetical protein PP1057
vascular cell adhesion molecule 1
inhibitor of DNA binding 2, dominant neg
dual specificity phosphatase 1
growth arrest and DNA-damage-inducible,
secreted protein, acidic, cysteine-rich
methionine adenosyltransferase II, beta
connective tissue growth factor
KIAA0948 protein
tetraspan NET-6 protein 

Homo sapiens mRNA; cDNA DKFZp564F053 (fr 
tripartite motif protein TRIM2



metallothionein 1L

PAK-interacting exchange factor beta
Insulin-like growth factor-binding prote
5' nucleotidase (CD73)
130431 L10284
130495 AA243278 
130553 AA430032 

130638 H16402 

130639 D59711 
130657 T94452 
130686 AA431571 
130776 R79356 
130818 AA280375 
130840 Z49269 
130899 Z41740
131002 AA121543 
131080 J05008
131084 AA101878 
131091 T35341 
131107 N87590 
131182 AA256153 
131207 W74533 
131319 U25997 
131328 V01512 
131328 V01512 
131328 V01512 
131328 V01512 
131509 X56681 
131555 AA161292 
131564 AA491465 
131573 AA046593 
131692 D50914 
131756 D45304 
131859 M90657 
131909 W69127 
131915 AA316186 
132046 AA384503 
132050 AA136353 
132151 AA044755 
132164 U84573 
132187 AA058911 
132303 AA620962 
132314 AA285290 
132358 X60486 
132398 R31641 
132421 AA489190 
132490 F13782
132520 AA257993 
132546 M24283 
132610 AA443114 
132716 T35289 
132840 N23817 
132883 AA047151 
132968 N77151 
132989 AA480074 
132999 Y00787 
133071 T99789 
133076 W84341
133099 L09209 
133147 D12763 
133149 T16484 
133161 AA253193 
133200 AA432248 
133220 X82200 
133260 AA083572 
133295 L00352 
133349 N75791 
133391 X57579 
133398 X02612 
133436 H44631 
133454 AA090257 
133478 X83703 
133491 L40395 
133510 AA227913 
133517 X52947 
133526 M11313 
133538 L14837 
133562 M60721 
133584 D90209 
133590 T67986 



AW505214 Hs.155560 
AW250380 Hs.109059 
AF062649 Hs.252587
AW021276 Hs.17121 
AI557212 Hs.17132 
AW337575 Hs.201591 
BE548267 Hs.337986 
AF167706 Hs.19280 
AW190920 Hs.19928 
BE048821 Hs.20144 
AI077288 Hs.296323 
AL05G295 Hs.22039 
NM_001955 Hs.2271
NM_017413 Hs.303084
AJ271216 Hs.22880 
BK20886 Hs.75354 
AI824144 Hs.23912
AF104266 Hs.24212 
NM_003155 Hs.25590
AW939251 Hs.25647 
AW939251 Hs.25647 
AW939251 Hs.25647 
AW939251 Hs.25647 
X56681 Hs.2780 
T47364 Hs.278613 
T93500 Hs.28792 
AA040311 Hs.28959 
BE559681 Hs.30736 
AA443966 Hs.31595 
AW960564 

NM_016558 Hs.274411
AJ161383 Hs.34549 
AB59214 Hs.179260 
AI267615 Hs.38022 
BE379499 Hs.173705 
AI752235 Hs.41270 
AA235709 Hs.4193 
BE177330 Hs.325093 
AF112222 Hs.323806 
NM_003542 Hs.46423
AA876616 Hs.16979 
AW163483 Hs.48320 
NM_001290 Hs.4980
AA257992 Hs.50651 
M24283 Hs.168383 
M1 50511 Hs.5326 
BE379595 Hs.283738 
BE218319 Hs.5807 
AA373314 Hs.5897 
AF234532 Hs.61638 
AA480074 Hs.331328 
Y00787 Hs.624 
BE384932 Hs.64313 
AW946276 Hs.6441 
W16518 Hs.279518 
AA026533 Hs.66 
AA370045 Hs.6607 
AW021103 Hs.6631 
AB037715 Hs.183639 
NM_006074 Hs.318501
AA403045 Hs.6906 
AI147861 Hs.213289 
AW631255 Hs.8110 
AW103364 Hs.727 
NM.000499Hs.7ai2 
BE294068 Hs.737 
BE547647 Hs.177781
X83703 Hs.31432 
BE619053 Hs.170001 
AW880841 Hs.96908 
NM_000165 Hs.74471
AU077051 Hs.74561 
NM_003257 Hs.74614
M60721 Hs.74870
D90209 Hs.181243 
T70956 Hs.75106 



calnexin 

mitochondrial ribosomal protein L12 
pituitary tumor-transforming 1 
ESTs 

ESTs, Moderately similar to I54374 gene 
ESTs 

Homo sapiens cDNA FLJ10934 fis, clone OV

cysteine-rich motor neuron 1

hypothetical protein SP329

small inducible cytokine subfamily A (Cy

serum/glucocorticoid regulated kinase

KIAA0758 protein

endothelin 1

apelin; peptide ligand for APJ receptor

dipeptidylpeptidase III

GCN1 (general control of amino-acid synt

ESTs

latrophilin

stanniocalcin 1

v-fos FBJ murine osteosarcoma viral onco 
v-fos FBJ murine osteosarcoma viral onco 
v-fos FBJ murine osteosarcoma viral onco 
v-fos FBJ murine osteosarcoma viral onco 
jun D proto-oncogene 
interferon, alpha-inducible protein 27
Homo sapiens cDNA FLJ11041 fis, clone PL
ESTs

KIAA0124 protein
ESTs

transmembrane 4 superfamily member 1

SCAN domain-containing 1

ESTs, Highly similar to S94541 1 clone 4

chromosome 14 open reading frame 4 

ESTs 

Homo sapiens cDNA: FLJ22050 fis, clone H 

procollagen-lysine, 2-oxoglutarate 5-dio

DKFZP58601624 protein

Homo sapiens cDNA: FLJ21210 fis, clone C

pinin, desmosome associated protein 

H4 histone family, member G 

ESTs, Weakly similar to A43932 mucin 2 p

double ring-finger protein, Dorfin 

LIM domain binding 2 

Janus kinase 1 (a protein tyrosine kinas 

intercellular adhesion molecule 1 (CD54)

amino acid system N transporter 2; porcu

casein kinase 1, alpha 1 

GTPase Rab14 

Homo sapiens mRNA; cDNA DKFZp586P1622 (f 
myosin X 

hypothetical protein FLJ13213
interleukin 8

ESTs, Weakly similar to AF257182 1 G-pro
Homo sapiens mRNA; cDNA DKFZp586J021 (fr
amyloid beta (A4) precursor-like protein
interleukin 1 receptor-like 1
AXIN1 up-regulated
hypothetical protein FLJ20373
hypothetical protein FLJ10210
Homo sapiens mRNA full length insert cDN
Homo sapiens cDNA: FLJ23197 fis, clone R
low density lipoprotein receptor (famili
L-3-hydroxyacyl-Coenzyme A dehydrogenase
inhibin, beta A (activin A, activin AB a
cytochrome P450, subfamily I (aromatic c
immediate early protein 
hypothetical protein MGC5618 
cardiac ankyrin repeat protein 
eukaryotic translation initiation factor
p53-induced protein 

gap junction protein, alpha 1, 43kD (con

alpha-2-macroglobulin

tight junction protein 1 (zona occludens

H2.0 (Drosophila)-like homeo box 1

activating transcription factor 4 (tax-r

clusterin (complement lysis inhibitor, S
33617
33651 
33671 
1 33678 
33681 
33722 
33730 
33750 
33802 
33825 



33859 



33975 
33977 
34039 
34075 
34081 
34164 
34203 
34238 
34299 
34332 
I34339 
34343 
34381 
34403 
34416 
34493 
34558 
34817 
34983 
34989 
35052 
35062 
35069 
35071 
35073 
35170 
35196 
35348 
i34404 
439561 
100082 
32817 
30150 
00104 
447973 
332613 
00113 
33980 
00129 
0)154 
100169 
29718 
100190 
I34742 
100211 
I00238 
30283 
34237 
00248 
00256 
00262 
3433 
00261 
00294 
00327 
00335 
1344% 
00338 
i35152 
34269 
00372 
34304 



AA148318 

U97105 

T25747 

K02574

D78577 

X53331 

S73591 

X95735

L16862 

U44975 

M97796 

U86782 

AA099391 

M19267 

D29992 

L19314 

S78569 

U28811 

177886 

C14407 

M60278 

R81509 

AA487558 



AA478971 

D50683 

U56637 

M61199 

M28882 

X15183 

S53911 

U20734 

D28235 

AA236324 

AA148923 

AA174183 

AA455311 

L08069 

AA452000

AA282140 

J02854

AA442054 

AB000450 

AB002380 

AB003103 

AB004884 

AF000573 

AF008937 

AF009301 

AF009368 

D00591 

D00760 

D11139 

D14657 

D14878 

D17716 

D21090 

D26135 

D26528 

D30742 

D31762 

D31765 

D31888 

D38128 

D38500 

D38551 

D42087 

D49396 

D55640 

D63391 

D63477 

D63483 

D64015 

D79990 

D79997 

D80010 



BE244334 Hs.75249 
AI301740 Hs.173381
AW503116 Hs.301819 
AW247252 
AI352558 

AW969976 Hs.279009
BE242779 Hs.179526 
BE410769 Hs.75873 
AW239400 Hs.76297 
BE616902 Hs.285313
BE222494 Hs.180919 
U85782 Hs.178761 
U48959 Hs.211582
M19267 Hs.77899 
C18356 Hs.295944 
AI125639 Hs.250666 
NM_002290 Hs.78672
NM_012201 Hs.78979
AL034349 Hs.79005 
AW245540 Hs.79516 
AA161219 Hs.799 
AA102179 Hs.160726 
AW580939 Hs.97199 
D86962 Hs.81875 
R70429 Hs.81988 
D50683 Hs.82028 
AI557280 Hs.184270 
AA334551 

X68264 Hs.211579 
M30827 Hs.289088
NM_001773 Hs.85289
AU076592 Hs.198951 
D28235 Hs.196384 
AW968058 Hs.92381
AL136653 Hs.93675 
AK000967 Hs.93872 
AA876372 Hs.93961 
W27190 Hs.94 



W55956 
T53169 
C03577 
U80983 



Hs.94030 
Hs.9587 
Hs.9615 
Hs.268177 



AB000450 Hs.82771 
AF180681 Hs.6582 
AA130080 Hs.4295 
N27852 Hs.57553 
BE094848 Hs.15113 
AF008937 

AB011169 Hs.20141
AF029674 Hs.173422 
NM_001269 Hs.84746
AA294921 Hs.348024 
AA469369 Hs.5831 
H60720 Hs.81892 
AL037228 Hs.82043
NM_002410 Hs.121502
M91401 Hs.178658 
NM_001346 Hs.89462
D26528 Hs.123058 
L24959 Hs.348 
NM_012288 Hs.153954
D31765 Hs.170114 
NM_015156 Hs.78398
D25418 Hs.393 
D38500 Hs.278468
N92036 Hs.81848
AF091035 Hs.184627 
AA331881 Hs.75454 
D55640 

AW247529 Hs.6793 
D63477 Hs.84087 
D86864 Hs.57735 
M96954 Hs.182741 
NWL014737Hs£0905 
NM_014791 Hs.184339
BE613486 Hs.81412 



ADP-ribosylation factor-like 6 interacti

dihydropyrimidinase-like 2

zinc finger protein 146 

nucleoside phosphorylase

tyrosine 3-monooxygenase/tryptophan 5-mo 

matrix Gla protein 

upregulated by 1,25-dihydroxyvitamin D-3
zyxin

G protein-coupled receptor kinase 6

core promoter element binding protein

inhibitor of DNA binding 2, dominant neg

26S proteasome-associated pad1 homolog

myosin, light polypeptide kinase

tropomyosin 1 (alpha) 

tissue factor pathway inhibitor 2 

hairy (Drosophila)-homolog

laminin, alpha 4

Golgi apparatus protein 1

protein tyrosine phosphatase, receptor t 

brain abundant membrane attached signal 

diphtheria toxin receptor (heparin-bindi

Homo sapiens cDNA FLJ11680 fis, clone HE

complement component C1q receptor

growth factor receptor-bound protein 10

disabled (Drosophila) homolog 2 (mitogen

transforming growth factor, beta recepto

capping protein (actin filament) muscle

sperm specific antigen 2

melanoma cell adhesion molecule

heat shock 90kD protein 1, alpha 

CD34 antigen 

jun B proto-oncogene 

prostaglandin-endoperoxide synthase 2 (p

nudix (nucleoside diphosphate linked moi

decidual protein induced by progesterone 

KIAA1682 protein

Homo sapiens mRNA; cDNA DKFZp667D095 (fr
DnaJ (Hsp40) homolog, subfamily A, membe
Homo sapiens mRNA; cDNA DKFZp586E1624 (f
Homo sapiens cDNA: FLJ22290 fis, clone H
myosin regulatory light chain 2, smooth
phospholipase C, gamma 1 (formerly subty
vaccinia related kinase 2
Rho guanine exchange factor (GEF) 12
proteasome (prosome, macropain) 26S subu
tousled-like kinase 2

homogentisate 1,2-dioxygenase (homogenti
syntaxin 16 

similar to S. cerevisiae SSM4 

KIAA1605 protein

chromosome condensation 1

v-ral simian leukemia viral oncogene hom

tissue inhibitor of metalloproteinase 1

KIAA0101 gene product

D123 gene product

mannosyl (alpha-1,6-)-glycoprotein beta-
RAD23 (S. cerevisiae) homolog B
diacylglycerol kinase, gamma (90kD)
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
calcium/calmodulin-dependent protein kin
TRAM-like protein
KIAA0051 protein
KIAA0071 protein

prostaglandin I2 (prostacyclin) receptor
postmeiotic segregation increased 2-like
RAD21 (S. pombe) homolog
KIAA0118 protein
peroxiredoxin 3 

gb:Human monocyte PABL (pseudoautosomal

platelet-activating factor acetylhydrola

KIAA0143 protein

acetyl LDL receptor; SREC

TIA1 cytotoxic granule-associated RNA-bi

Ras association (RalGDS/AF-6) domain fam

KIAA0175 gene product

lipin 1
100394
100405 
100418 
133154 
134347 
444099 
100438 
134593 
100481 
100552 
100591 
100652 
100862 



D84276 
D86425 



100905 
100945 
100950 
100984 
135407 
130149 
131877 
101016 
134786 
134100 
134078 
101132 
134849 
332736 
101152 
135397 
432642 
101168 
421155 
101226 
415138 
134739 
130155 
440538 
409916 
101294 
101300 
101310 
130344 
101381 
101381 
415678 
133780 
101398 
101447 
101458 
101470 
134604 
101478 
133519 
131185 
134116 
133999 
130174 
129963 
132983 
133900 
101543 
101545 
101620 
134691 
133595 
101700 
101714 
134246 
101760 
415022 
415022 
415022 
415022 
415022 
101791 



D87012 
D87075
D87432
D87448
D87845

HG1098-HT1098
HG2167-HT2237
HG2415-HT2511
HG2825-HT2949
HG2887-HT3031
HG4660-HT5073
HG4704-HT5146
HG884-HT884
HG919-HT919
J00212
J04029
J04031



J04543 
L06139 
L07540 
L08895 
L11239 
L11353 
L13773 
L13800 
L14922 
L15189 
L15388 
L16895 
L27476 
L27624 
L32976 
L33404 
L35263 
L37347 
L40371 
L40391 
L41607 
L77566 
M13928 



M14016 
M14219 
M15796 
M21305 
M22092 
M22898 
M22995 
M23379 
M24400 
M25753 
M27691 
M28213 
M29550 
M29971 
M30269 
M31158 
M31166 
M31210 
M55420 
M59979 
M62810 
M64710 
M68874 
M74524 
M80254 
M81780 
M81780 
M81780
M81780 
M81780 
M83822 



D84284 Hs.66052 

AW291587 Hs.82733 

D88978 Hs.84790 

D87012 Hs.194685 

AF164142 Hs.82042 

D87432 Hs.10315 

AA013051 Hs.91417 
NM_000437 Hs.234392

X70377 Hs.121489

AA019521 Hs.201946
NM_004091 Hs.231444

BE613608 Hs.142653 

AI368680 Hs.816 

AL039123 Hs.103042 

L12260 Hs.172816 

AF002225 Hs.180686 

AF128542 Hs.166846 
J00212 

J04029 Hs.99936 

AW057805 Hs.172665

J04088 Hs.156345

J04543 Hs.78537

T29618 Hs.89540 

AA460085 Hs.171075 

L08895 Hs.78995 

L11239 Hs.36993

BE409525 Hs.902 

283689 Hs.114765



L14922 Hs.166563 
BE297635 Hs.3069 
NM_005308 Hs.211569
H87879 Hs.102267
AF083892 HsJ5608
C18356 Hs.295944
NM_002419 Hs.89449
AA101043 Hs.151254 
W76332 Hs.79107 
BE313625 Hs.57435 
AF168418 Hs.116784 
BE535511 
U1607 Hs.934 
AW250122 Hs.154879 
AW675039 Hs.1227 
AW675039 Hs.1227 
AW005903 Hs.78601 
AA557660 Hs.76152
BE267931 Hs.78996 
M21305 
M22092 

NM_000546 Hs.1846
NM_002884 Hs.865
NM_002890 Hs.758
AW583062 Hs.74502 
BE280074 Hs.23960 
R84694 Hs.79194 
AA535244 Hs.78305 



M29551 
M29971 



M31158 
M31166 
BE246154 
S55271 
AW382987 
AA393273 
D90337 
M68874 
028459 
M80254
X59960 



X59960 



M83822 



Hs.151531 
Hs.1384 

Hs.77439 

Hs.2050 

Hs.154210 

Hs.247930

Hs.88474 

Hs.75133 

Hs.247916 

Hs.211587 

Hs.80612 

Hs.173125 

Hs.77813 

Hs.77813 

Hs.77813 

Hs.77813 

Hs.77813 

Hs.62354 



CD38 antigen (p45) 

nidogen 2

KIAA0225 protein

topoisomerase (DNA) III beta

solute carrier family 23 (nucleobase tra

solute carrier family 7 (cationic amino

topoisomerase (DNA) II binding protein

platelet-activating factor acetylhydrola

cystatin D

lysosomal 

Homo sapiens, Similar to hypothetical pr 
ret finger protein 

SRY (sex determining region Y)-box 2 
microtubule-associated protein 1B
neuregulin 1

ubiquitin protein ligase E3A (human papi
polymerase (DNA directed), epsilon
Empirically selected from AFFX single pr
keratin 10 (epidermolytic hyperkeratosis
methylenetetrahydrofolate dehydrogenase
topoisomerase (DNA) II alpha (170kD)
annexin A7

TEK tyrosine kinase, endothelial (venous
replication factor C (activator 1) 5 (36
MADS box transcription enhancer factor 2
gastrulation brain homeobox 1
neurofibromin 2 (bilateral acoustic neur
myeloid/lymphoid or mixed-lineage leukem
spindle pole body protein
replication factor C (activator 1) 1 (14
heat shock 70kD protein 9B (mortalin-2)
G protein-coupled receptor kinase 5 
lysyl oxidase 

tight junction protein 2 (zona occludens
tissue factor pathway inhibitor 2
mitogen-activated protein kinase kinase
kallikrein 7 (chymotryptic, stratum corn
mitogen-activated protein kinase 14
solute carrier family 11 (proton-coupled
thyroid hormone receptor interactor 4
transmembrane trafficking protein
glucosaminyl (N-acetyl) transferase 2, I
DiGeorge syndrome critical region gene D
aminolevulinate, delta-, dehydratase
aminolevulinate, delta-, dehydratase
uroporphyrinogen decarboxylase
decorin

proliferating cell nuclear antigen
gb:Human alpha satellite and satellite 3
gb:Human neural cell adhesion molecule (
tumor protein p53 (Li-Fraumeni syndrome)
RAP1A, member of RAS oncogene family
RAS p21 protein activator (GTPase activa
chymotrypsinogen B1
cyclin B1

cAMP responsive element binding protein
RAB2, member RAS oncogene family
protein phosphatase 3 (formerly 2B), cat
O-6-methylguanine-DNA methyltransferase
nidogen (enactin)

protein kinase, cAMP-dependent, regulato
pentaxin-related gene, rapidly induced b



Epsilon , IgE 

prostaglandin-endoperoxide synthase 1 (p
transcription factor 6-like 1 (mitochond
natriuretic peptide precursor C
phospholipase A2, group IVA (cytosolic,
ubiquitin-conjugating enzyme E2A (RAD6 h
peptidylprolyl isomerase F (cyclophilin
sphingomyelin phosphodiesterase 1, acid
sphingomyelin phosphodiesterase 1, acid
sphingomyelin phosphodiesterase 1, acid
sphingomyelin phosphodiesterase 1, acid
sphingomyelin phosphodiesterase 1, acid
cell division cycle 4-like
101812
01813 
33396 
28161 
29026 
01901 
34831 
34039 
442355 
01975 
101977 
101978 
01998 
02003 
02007 



M86934 
M87338 
M96326 
M96954 
M98833 
S66793 
S72370 
S78569 
S79873 
S83325 
S83364 



32951 
35389 
02048 
30145 

303153 
20269 
02095 
02123 
02126 
02133 
02139 
102162 
102164 

427653 
31817 
02200 
02210 
02214 
32811 
31319 
02256 
32316 
02269 
17526 
02293 
02298 
I02325 

428734 
02361 
02367 
02388 
02394 
29829 
02409 
33746 
02423 
32828 
32828 
32828 



25322 
02450 
29350 
02534 
30457 
35065 
02560 
02567 
17173 
02638 
32736 
33070 
02663 



02735 
102741 
130564 
30564 
32164 



U01212 
U01922 
U02556 
U02680
U03272 
U04209 
U05237 
U07225 
U07620 
U09759 
U09820 
U11313 
U14518 
U14575 
U15173 
U15932 
U18291 
U18300 
U18383 
U20536 
U21551 
U23028 
U23752 
U25435 
U25997 
U28251 
U28831 
U30245 
U32315 
U32439 
U32849 
U35139 
U36764 
U39400 
U39657 
U41344 
U41766 
U41813 
U43286 
U44378 
U44754 
U47011 
U47011 
U47011 
U47011 
U47077 
U48251 
U50535 
U56833 
U58091 
U58837 
U59289 
U59863 
U67122 
U67319 
U68019 
U69611 
U70322 
U73524 
U79267 
U79291 
U82671 
U82671 
U84573 



BE439894 Hs.78991 
NM_002914 Hs.139226
M96326 Hs.72885 
M96954 Hs.182741 
AL120297 Hs.108043 
H38026 Hs.308 
AA853479 Hs.89890 
NM_002290 Hs.76672
AA456539 Hs.8262 
AA079717 Hs.283664 
AF112213 Hs.184062 
BE561610 HsJSm 
U01212 Hs.248153 
U01922 Hs.125565 
U02556 Hs.75307 
BE245149 Hs.82643 
U03272 Hs.79432 
AW821182 Hs.61418 
U05237 Hs.99872 



U07225 
U34820 
U09759 
U72937 
U11313 



Hs.339 

Hs.151051 

Hs.246857 

Hs.96264 

HsJ5760 



NM_001809 Hs.1594
AW950870 Hs.78961
AU076845 Hs.155596
NM_004419 Hs.2128
AA450274 Hs.1592
NM_000107 Hs.77602
AA159001 Hs.180069
U20536 Hs.3280
AA232362 Hs.157205
BE619413 Hs.2437
U23752 Hs.32964
U25435 Hs.57419
NM_003155 Hs.25590
U28251 Hs.53237 
U28831 Hs.44566 
U30245 

AA568906 Hs.82240 
AF090116 Hs.79348 
AA382169 Hs.54483 
AI815867 Hs*0130 
BE303044 Hs.192023
AA223616 Hs.75859
U39656 Hs.118825
AA362S07 Hs.76494
NM_003816 Hs.2442
AF010258 Hs.127428
BE300330 Hs.118725 
AW410035 Hs.75862 
Z47542 Hs.179312 
AB014615 Hs.57710 
AB014615 Hs.57710 
AB014615 Hs.57710 
AB014615 Hs.57710 
U63630 Hs.155637 
U48251 Hs.75871 
U50535 Hs.110630 
U96759 Hs.198307 
AB014595 Hs.155976 
AA019401 Hs.93909 
R97457 Hs.63984 
U63830 Hs.146847 
U61397 Hs*1424 
U67319 Hs.9216 
AW081883 Hs.211578
U92649 Hs.64311
NM_002270 Hs.168075
U73524 Hs.87465
AF111106 Hs.3382
AW959829 Hs.83572
U82671 Hs.36980
U82671 Hs.36980
AI752235 Hs.41270



DNA segment, numerous copies, expressed
replication factor C (activator 1) 2 (40
azurocidin 1 (cationic antimicrobial pro
TIA1 cytotoxic granule-associated RNA-bi
Friend leukemia virus integration 1
arrestin 3, retinal (X-arrestin)
pyruvate carboxylase
laminin, alpha 4

lysosomal-associated membrane protein 2

aspartate beta-hydroxylase

putative Rab5-interacting protein

putative transmembrane protein; homolog

olfactory marker protein

translocase of inner mitochondrial membr

t*conipiex^ss£ ri 3te^MB5 fis , *expressed 1* 

protein tyrosine kinase 9 

fibrillin 2 (congenital contractural ara

microfibrillar-associated protein 1

fetal Alzheimer antigen

purinergic receptor P2Y, G-protein coupl

mitogen-activated protein kinase 10

mitogen-activated protein kinase 9

alpha thalassemia/mental retardation syn

sterol carrier protein 2

centromere protein A (17kD)

protein phosphatase 1, regulatory (inhib

BCL2/adenovirus E1B 19kD-interacting pro

dual specificity phosphatase 5

CDC16 (cell division cycle 16, S. cerevi

damage-specific DNA binding protein 2 (4 

nuclear respiratory factor 1 

caspase 6, apoptosis-related cysteine pr 

branched chain aminotransferase 1, cytos 

eukaryotic translation initiation factor

SRY (sex determining region Y)-box 11

CCCTC-binding factor (zinc finger protei

stanniocalcin 1

ESTs, Highly similar to Z169_HUMAN ZINC
KIAA1641 protein

gb:Human myelomonocytic specific protein
syntaxin 3A

regulator of G-protein signalling 7
N-myc (and STAT) interactor
necdin (mouse) homolog
eukaryotic translation initiation factor
chromosome 11 open reading frame 4
mitogen-activated protein kinase kinase
proline arginine-rich end leucine-rich r
a disintegrin and metalloproteinase doma
homeo box A9 

selenophosphate synthetase 2 
MAD (mothers against decapentaplegic, Dr 
small nuclear RNA activating complex, po 
fibroblast growth factor 8 (androgen-ind
fibroblast growth factor 8 (androgen-ind
fibroblast growth factor 8 (androgen-ind
fibroblast growth factor 8 (androgen-ind
protein kinase, DNA-activated, catalytic
protein kinase C binding protein 1
Human BRCA2 region, mRNA sequence CG006
von Hippel-Lindau binding protein 1
cullin 4B

cyclic nucleotide gated channel beta 1

cadherin 13, H-cadherin (heart)

TRAF family member-associated NFKB activ

ubiquitin-like 1 (sentrin)

caspase 7, apoptosis-related cysteine pr

Homo sapiens cONA: FU23G37 fe, done L 

a disintegrin and metalloproteinase doma

karyopherin (importin) beta 2

ATP/GTP-binding protein

protein phosphatase 4, regulatory subuni

hypothetical protein MGC14433

melanoma antigen, family A, 2

melanoma antigen, family A, 2

procollagen-lysine, 2-oxoglutarate 5-dio
02823
02826 
02831 



29777 
34161 
34854 
29257 
13985 
19768 
02915 
34656 



02968 
02971 
34037 
34037 
03023 
03037 
30282 
34542 
28568 
28568 
03093 
13076 
29063 
24460 

411077 
103181 
03184 
03194 
03208 
29698 
31486 
30729 
03334 
32645 
35094 
03352 
03352 
03353 
32173 
03371 
31584 
03376 
03378 
28510 
03410 
33490 

332689 
03438 
03440 
03452 
33536 
20234 

426502 



32083 
03500 
34389 
32084 
03540 
33152 
03548 
03612 
29092 
103692 
1036% 
29796 
434993 



31887 
03723 



33260 



U90914 D85390 Hs.5057 carboxypeptidase D

U91316 NM_007274 Hs£679 cytosolic acyl coenzyme A thioester hydr

U91932 AA262170 Hs.80917 adaptor-related protein complex 3, sigma

U96131 BE264974 Hs.6566 thyroid hormone receptor interactor 13

U97018 U97018 Hs.12451 echinoderm microtubule-associated protei

U97188 AA634543 Hs.79440 IGF-II mRNA-binding protein 3

V00503 J03464 Hs.179573 collagen, type I, alpha 2

X04327 AW163799 Hs.198365 2,3-bisphosphoglycerate mutase

X06389 AI018666 H&J5667 synaptophysin

X07496 T72104 Hs.93194 apolipoprotein A-I

X07820 X07820 Hs?758 matrix metalloproteinase 10 (stromelysin

X14787 AI750878 Hs.87409 thrombospondin 1

X15525 NM_001610 Hs.75589 acid phosphatase 2, lysosomal

X16396 AU076611 Hs.154672 methylene tetrahydrofolate dehydrogenase

X16609 X16609 Hs.183805 ankyrin 1, erythrocytic

X53586 AI808780 Hs.227730 integrin, alpha 6

X53586 AI808780 Hs.227730 integrin, alpha 6

X53793 AW500470 Hs.117950 multifunctional polypeptide similar to S

X54936 BE018302 Hs.2894 placental growth factor, vascular endoth

X55740 BE245380 Hs.153952 5' nucleotidase (CD73)

X57025 M14156 Hs.85112 insulin-like growth factor 1 (somatomedi

X60673 H12912 Hs.274691 adenylate kinase 3

X60673 H12912 Hs.274691 adenylate kinase 3

X60708 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine

X62048 U10564 Hs.75188 wee1 (S. pombe) homolog

X63097 X63094 Hs.283822 Rhesus blood group, D antigen

X63563 BE275979 Hs.296014 polymerase (RNA) II (DNA directed) polyp 

X64037 AW977263 Hs.68257 general transcription factor IIF, polype

X69536 X69636 Hs.334731 Homo sapiens, clone IMAGE:3446306, mRNA,

X69878 U43143 Hs.74049 fms-related tyrosine kinase 4

X70649 NM_004939 Hs.78580 DEAD/H (Asp-Glu-Ala-Asp/His) box polypep

X72841 AW411340 Hs.31314 retinoblastoma-binding protein 7

X74987 BE242144 Hs.12013 ATP-binding cassette, sub-family E (OABP

X83107 F06972 Hs.27372 BMX non-receptor tyrosine kinase

X84194 AI963747 Hs.18573 acylphosphatase 1, erythrocyte (common)

X85753 NM_001260 Hs.25283 cyclin-dependent kinase 8

X87870 AI654712 Hs.54424 hepatocyte nuclear factor 4, alpha

X89066 NM_003304 Hs.250687 transient receptor potential channel 1

X89398 H09366 Hs.78853 uracil-DNA glycosylase

X89398 H09366 Hs.78853 uracil-DNA glycosylase

X89399 X89399 Hs.119274 RAS p21 protein activator (GTPase activa

X89426 X89426 Hs.41716 endothelial cell-specific molecule 1

X91247 X91247 Hs.13046 thioredoxin reductase 1

X91648 AA598509 Hs.29117 purine-rich element binding protein A

X92098 AL036166 Hs.323378 coated vesicle membrane protein

X92110 AL119690 Hs.153618 HCGVHM protein

X94703 X94703 RAB28, member RAS oncogene family

X96506 AA158294 Hs.295362 DR1-associated protein 1 (negative cofac

X97230 AF022044 Hs.274601 killer cell immunoglobulin-like receptor

X97230 AF022044 Hs.274601 killer cell immunoglobulin-like receptor

X98263 AW175781 Hs.152720 M-phase phosphoprotein 6 

X98296 X98296 Hs.77578 ubiquitin specific protease 9, X chromos

X99584 NM_006936 Hs.85119 SMT3 (suppressor of mif two 3, yeast) ho

Y00264 W25797.comp Hs.177486 amyloid beta (A4) precursor protein (pro

Y07566 AW404908 Hs.95038 Ric (Drosophila)-like, expressed in many

Y07759 Y07759 Hs.170157 myosin VA (heavy polypeptide 12, myoxin)

Y07827 NM_007048 Hs.284283 butyrophilin, subfamily 3, member A1

Y07857 BE386490 Hs.279663 Pirin

Y09443 AW408009 Hs.22580 alkylglycerone phosphate synthase

Y09858 Y09858 Hs.82577 spindlin-like

Y12394 NM_002267 Hs.3886 karyopherin alpha 3 (importin alpha 4)

Z11559 NM_002197 Hs.154721 aconitase 1, soluble

Z11695 Z11695 Hs.324473 mitogen-activated protein kinase 1

Z15005 Z15005 Hs.75573 centromere protein E (312kD)

Z46261 BE336654 Hs.70937 H3 histone family, member A

AA011243 D56365 Hs.63525 poly(rC)-binding protein 2

AA018418 AW137912 Hs.227583 Homo sapiens chromosome X map Xp11.23 L-

AA018758 AW207152 Hs.186600 ESTs

AA018804 BE218319 Hs.3807 GTPase Rab14

AA031993 AA306325 Hs.4311 SUMO-1 activating enzyme subunit 2

AA044217 BE264633 Hs.143638 WD repeat domain 4

AA046548 W17064 Hs.332848 SWI/SNF related, matrix associated, acti

AA057447 BE274312 Hs.214783 Homo sapiens cDNA FLJ14041 fis, clone HE

AA058376 W20296 Hs.288178 Homo sapiens cDNA FLJ11988 fis, clone HE

AA083572 AA403045 Hs.6905 Homo sapiens cDNA: FLJ23197 fis, clone R

AA085696 AA085696 Hs.169600 KIAA0826 protein
103766 AA088744

103767 AA089688 
132051 AA091284 
103773 AA092700 
135289 AA092968
403559 AA094800 
103794 AA100219 
131471 AA114885 
134319 AA129547 
103807 AA133016 
446392 AA149507 
129863 AA151005 
103850 AA187101 
103855 AA195179 
103861 AA206236 
130634 AA227621 
447735 AA248283 
103909 AA249611 
458928 AA282640 
415824 AA287199 
129013 AA313990 
129435 AA314256 
103988 AA314369 
104000 AA324364 
425284 AA329211 
128629 AA399187 
133281 AA421079 
104104 AA422029 
332455 AA425230 
132091 AA447052 
135073 AA452000 
131367 AA456687 
129593 AA487015 
133505 C01527 
132064 C01714 
442351 C01811 
131427 C02352 
433892 C02375 
104282 C14448 
134827 D16611 
425330 D25216 
131742 D31352 
456935 D58024 
425218 D80897 
104334 D82614 
134593 D87845 
134731 D89377 
445776 H06583 
131670 H40732 
104394 H46617 
104402 H56731 
439130 H75570 
129077 H788SS 
104417 H81241 
134927 L36531 
129280 M63154 
134498 M63180 
104460 M91504 
104488 N56191 
131248 N78483 
130017 R14652 
104530 R20459 
104534 R22303 
104544 R33779 
133328 R36553 
104567 R64534 
129575 R70621 
130776 R79356 
104599 R84933 
104660 AA007160 
104667 AA007234 
104718 AA0184Q9 
104764 AA025351 

104786 AA027168 

104787 AA027317 
134079 AA029423 



AI920783 Hs.191435 
BE244667 

AA393968 Hs.180145

AI219323 Hs.101077

AW372569 Hs.9788

AW970843 Hs.55682

AF244135 Hs.30670

AA164842 Hs.192519

BE304999 Hs.285754

AW958264 Hs.103832

AF142419 Hs.15020

BE379765 Hs.129872

AA187101 Hs.213194
W02363 

AA206236 Hs.4944 

AI769067 Hs.127824 

AA775268 Hs.6127 

AA249611 Hs.47438 

AF043117 Hs.24594 

D42039 Hs.78871 

AA371156 Hs.107942 

AF151852 Hs.111449

AA314389 Hs.342849

AJ146527 Hs.80475

AF155568 Hs.348043

AL096748 Hs.102708

AK001601 Hs.69594

AA422029 Hs.143640
NM_005754 Hs.220689
AW954243 

W55956 Hs.94030 

AI750575 Hs.173933 

AI338247 Hs.98314 

AI630124 Hs.324504 

AA121098 Hs.3838 

W52642 Hs.8261 

AF151879 Hs.26706

AI929357 Hs.323966 

C14448 Hs.332336 

BE314037 Hs.69866 

D25216 Hs.155650 

AA961420 Hs.31433 

AA370362 Hs.57958 

NM_014909 Hs.155182

D82614 Hs.78771
NM_000437 Hs.234392
D89377 Hs.89404
NM_001310 Hs.13313
H03514 Hs.15589
AA129551 Hs.172129
H56731 Hs.132956
AA306090 Hs.124707
N74724 Hs.108479 
AI819448 Hs.320861 
L36531 Hs.91295 
M63154 Hs.110014 
AW246273 Hs.84131 
AW955705 Hs.62604 
N56191 Hs.106511 
AI038989 Hs.332633 
AK000096 Hs.143198 
AK001676 Hs.12457 
R22303 

AI091173 Hs.222382
AW452738 Hs.265327
AA040620 Hs£672
F08282 Hs37B428
AF167706 Hs.19280
AW815036 Hs.151251
BE298665 Hs.14846
AI239923 Hs.63931
AI143020 Hs.36250
AI039243 Hs.378585
AA027167 Hs.10031 
AA027317 

AK001751 Hs.171835 



ESTs 

CGI-100 protein
HSPC030 protein

ESTs, Weakly similar to T223S3 hypotheti

hypothetical protein MGC10924 similar to

eukaryotic translation initiation factor

hepatocellular carcinoma-associated anti

KIAA1600 protein

fumarate hydratase

similar to yeast UpQ, variant B

homolog of mouse quaking QKI (KH domain

sperm associated antigen 9

hypothetical protein MGC10895

hypothetical protein FLJ10330

hypothetical protein FLJ12783

ESTs, Weakly similar to T28770 hypotheti

Homo sapiens cDNA: FLJ23020 fis, clone L

SH3 domain binding glutamic acid-rich pr

ubiquitination factor E4B (homologous to

mesoderm development candidate 2

DKFZP564M112 protein

CGI-94 protein

ADP-ribosylation factor-like 5

polymerase (RNA) II (DNA directed) polyp

NS1-associated protein 1

DKFZP434A043 protein

high-mobility group 20A

ESTs, Weakly similar to hyperpolarizatio

Ras-GTPase-activating protein SH3-domain

KIAA0251 protein

Homo sapiens mRNA; cDNA DKFZp586E1624 (f
nuclear factor I/A

Homo sapiens mRNA; cDNA DKFZp586L0120 (f
Homo sapiens mRNA; cDNA DKFZp586J0720 (f
serum-inducible kinase
hypothetical protein FLJ22393
CGI-121 protein 

Homo sapiens clone H63 unknown mRNA 
EST 

coproporphyrinogen oxidase (coproporphyr 

KIAA0014 gene product 

ESTs 

EGF-TM7-latrophilin-related protein
KIAA1036 protein
phosphoglycerate kinase 1
platelet-activating factor acetylhydrola
msh (Drosophila) homeo box homolog 2
cAMP responsive element binding protein-
ESTs

Homo sapiens cDNA: FLJ21409 fis, clone C

ESTs 

ESTs 

ESTs 

Kruppel-like factor 6
integrin, alpha 8

gastric intrinsic factor (vitamin B synt

threonyl-tRNA synthetase

Homo sapiens, clone IMAGE:4299322, mRNA,

protocadherin 17

Bardet-Biedl syndrome 2

inhibitor of growth family, member 3

hypothetical protein FLJ10814

gb:yh26b09.r1 Soares placenta Nb2HP Homo

ESTs, Weakly similar to p40 [H.sapiens]

hypothetical protein DKFZp76 11141

hypothetical protein AF140225

progestin induced protein

cysteine-rich motor neuron 1

ESTs

Homo sapiens mRNA; cDNA DKFZp564D016 (fr
ESTs

ESTs, Weakly similar to I38022 hypotheti
ESTs

KIAA0955 protein

gb:ze97d11.s1 Soares_fetal_heart_NbHH19W
hypothetical protein FLJ10889






WO 02/079492 



PCT/US02/04915 






04804 
04865 
30828 
04907 
04943 
05013 
05024 
32592 
05038 
05077 
05096 
29215 
05169 
32796 
[27210 
05200 
30114 
05330 
05337 
22040 
05376 
05397 
31679 
31991 
21305 
05489 
05508 
05539 
135172 
131569 
431129 
05643 
05659 
05666 
05674 
05709 
05722 
05765 
15951 
30884 
05962 



06008 

457322 
34222 

446954 
06141 

447973 
06157 
28314 

446727 
06196 
57714 
33200 
06302 
06328 

450534 
06423 

439608 
I06477 
06503 

446999 
06543 

442007 



08593 
06596 
23064 
06636 
06654 
31353 
06707 
52909
06717 
453141 
106747 



AA031357
AA045136 
AA0534C0 
AA055829 
AA065217 
AA116054 
AA126311 
AA129390 
AA130273 
AA142919 
AA150205
AA176867 
AA180321 
AA180487 
AA187634 
AA195399 
AA234717 
AA234743 
AA234957 
AA235604 
AA236559 
AA242658 
AA251776 
AA251909 
AA252672 
AA256157 
AA256680 
AA258873 
AA262727 
AA281451 
AA281545 
AA282069 
AA283044 
AA283930 
AA284755 
AA291268 
AA291927 
AA343514 
AA398109 
AA398109 
AA405737 
AA406610 
AA411465 
AA416886 
AA424013 
AA424148 
AA424558 
AA424961 
AA425357 
AA425921 
AA426220 
AA427735 
AA430673 
AA432248 
AA435896 
AA436705 
AA446561 
AA448238 
AA449756 
AA450303 
AA452411 
AA454566 
AA454667 
AA456437 
AA456646 
AA456826 
AA456981 
AA458959 
AA459950 
AA460449 
AA463910 
AA464603 
AA464606 
AA465093 
AA465692 
AA476473 



AI858702 Hs.31803 
T79340 H2L22575 
AW831469 Hs.203213
AA055829 Hs.195701
AF072873 Hs.114218
H63789 Hs.296288
AA126311 Hs.9879
AW803564 Hs^8885Q
AW503733 Hs.9414
W55946 Hs.234883
AL042506 Hs.21599
AB040930 Hs.126085
BE245294 Hs.180789
NM_006283 Hs.173159
BE396283 Hs.173987
AA328102 Hs.24641
AA233393 Hs.14992
AW338625 Hs.22120
AI468789 Hs.347187 
AA172106 Hs.110950 
AW994032 Hs.8768 
AA814807 Hs.7395 
AK000046 Hs.343877 
AF053306 Hs.36708 
BE397354 Hs.324830 
AA256157 Hs.24115 
AA173942 Hs.326416 
AB040884 Hs.109694 
AB028956 Hs.12144 
AL389951 Hs.271623 
AL137751 Hs.263671 
BE621719 Hs.173802 
AA283044 Hs.25525 
AA426234 Hs.34906 
AI609530 Hs.279789 
AI928962 Hs.26761 
AI922821 Hs.32433 
AA299688 Hs.24183 
BE546245 Hs.301048 
BE546245 Hs.301048 
AW880358 Hs.339808 
AA406610 
AB033888 Hs.8619 
AI815486 Hs.243901
AW855861 Hs.6025
AB037850 Hs.16621
AF031463 Hs.9302
AB011169 Hs.20141
W37943 Hs.34892
AW135049 Hs.26285
AB011095 Hs.16032
AA525993 Hs.173599
AA083764

AB037715 Hs.183639
AA398859 Hs.18397
AL079559 Hs.28020
AI570189 Hs.25132
AB020722 Hs.16714
AW864696 Hs.301732
R23324 Hs.41693
AB033042 Hs.29679
AA151520

AA676939 Hs.69285
AA301116 Hs.142838
AK000933 Hs.28661
AW296451 Hs.24605
AA452379
AF265208 Hs.8740
AW958037 Hs.286
AW075485 Hs.286049
AW754182
AK000566 Hs.98135
NM_015358 Hs.30985
AA600357 Hs.239489
AB014548 Hs.31921
NM_007118 Hs.171957



ESTs, Weakly similar to N-WASP [H.sapien
B-cell CLL/lymphoma 6, member B (zinc fi
ESTs

ESTs, Weakly similar to ALU1_HUMAN ALU S

frizzled (Drosophila) homolog 6

ESTs, Weakly similar to KIAA0638 protein

ESTs

Homo sapiens cDNA: FLJ22528 fis, clone H
KIAA1488 protein

Homo sapiens cDNA FLJ12082 fis, clone HE
Kruppel-like factor 7 (ubiquitous)
KIAA1497 protein
S164 protein

transforming, acidic coiled-coil contain
eukaryotic translation initiation factor
cytoskeleton associated protein 2
hypothetical protein FLJ11151
ESTs

myotubularin related protein 1
Rag C protein

hypothetical protein FLJ10849
hypothetical protein FLJ23182
hypothetical protein FLJ20039
budding uninhibited by benzimidazoles 1
diptheria toxin resistance protein requi
Homo sapiens cDNA FLJ14176 fis, clone NT
Homo sapiens mRNA; cDNA DKFZp554H1916 (f
KIAA1451 protein
KIAA1033 protein
nucleoporin 50 kD

Homo sapiens mRNA; cDNA DKFZp434I0812 (f

KIAA0803 gene product

hypothetical protein FLJ11323

ESTs, Weakly similar to T17210 hypotheti



DKFZP586LJ0724 protein 

ESTs 

ESTs 

sec13-like protein

sec13-like protein

hypothetical protein FLJ10120

gb:zv15b10.s1 Soares_NhHMPu_S1 Homo sapi

SRY (sex determining region Y)-box 18

Homo sapiens cDNA FLJ20738 fis, clone HE

Homo sapiens clone 23767 and 23782 mRNA

DKFZP434I116 protein

phosducin-like

similar to S. cerevisiae SSM4
KIAA1323 protein

Homo sapiens cDNA FLJ10643 fis, clone NT
KIAA0523 protein

ESTs, Weakly similar to ALULHUMAN ALU S

hypothetical protein MGC3178

hypothetical protein FLJ10210

hypothetical protein FLJ23221

KIAA0766 gene product

KIAA0470 gene product

Rho guanine exchange factor (GEF) 15

hypothetical protein MGC5306

DnaJ (Hsp40) homolog, subfamily B, membe

cofactor required for Sp1 transcriptiona

hypothetical protein MGC4485

neuropilin 1

nucleolar phosphoprotein Nopp34

Homo sapiens cDNA FLJ10071 fis, clone HE

ESTs

ESTs, Moderately similar to ALU7_HUMAN A
SWI/SNF related, matrix associated, acti
ribosomal protein L4
phosphoserine aminotransferase
gb^C2-CT0321-13119W)11-c01 CT0321 Homo
hypothetical protein FLJ20559
pannexin 1

TIA1 cytotoxic granule-associated RNA-bi
KIAA0648 protein

triple functional domain (PTPRF interact
106773


AA478109 


AA478109 


Hs.188833 


105781 


AA478474 


AA330310 


Hs.24181 


105817 


AA480889 


D61216 


Hs.18672 


105846 


AA485223 


AB037744 


Hs.34892 


106848 


AA485254 


AA449014 


Hs.121025 


106856 


AA486163 


W58353 


Hs.285123


418699 


AA496936


BE539639 


Hs.173030 


107001 


AA598589 


AI926520


Hs.31016 


442853 


AA598831 


AW021276 Hs.17121 


107054 


AA600150 


AJ076459 


Hs.15978 


107059 


AA608545 


BE614410 


Hs.23044 


107080 


AA609210 


AL122043 


Hs.19221 


107115 


AA610108 


BE379623 


Hs.27693


107130 


AA620582 


AB033106 


Hs.12913 


107156 


AA621239 


AA137043 


Hs.9663 


107174 


AA621714 


BE122762 


Hs.25338 


130821 


AA621718 


AW513087 Hs.16803 


107190 


D19S73 


AA835401 


Hs.87860 


132626 


D25755 


AW504732 Hs.21275 


107217 


D51095 


AL080235 


Hs.35861 


332584 


D60272 


AA357679 


Hsl29423 


444655 


T08879 


AF088886 


Hs.11590 


107295 


T34527 


AA186629 


Hs.80120 


107299 


T40327 


BE277457 


Hs.30661 


107315 


TB2771 


AA316241 


Hs.90691 


107316 


T63174


T63174 


Hs.193700 


107328 


T83444 


AW959891 Hs.76591 


107334 


T93641 


T93597 


Hs.187429 


456340 


U48263 


U48263 


Hs.89040 


128636 


U49065 


U49065 


Hs.102865 


129938 


U79300 


AW003668 Hs.135587 


107375 


U88573 


BE011845 


Hs.251064 


130074 


U93867


AL038596 


Hs.250745 


107387 


W01094 


D86983 


Hs.118893 


iVXPtR 
10£UOU 


W0156B 


AL157433 


Hs.37706 


107426 


W26853 


W26853 


Hs.291003


135388 


W27965 


W27965 


Hs.99865 


130419 


W36280 


AF037448 


Hs.155489 


107469 


W47063 


W47063 


Hs.94668 


434203 


W79060 


BE262677 


Hs.283558 


107506 


W88550 


AB028981 


Hs.8021 


132358 


X60486 


NM_003542 Hs.46423


107522 


X78931


X78931 


Hs.99971 


456495 


Z14077 


NM_003403 Hs.97496




AA002147 


AA002147 


Hs.59952 




AA004711 


R75654 


Hs.164797 


107661 


AA010383 


AA010383 


Hs.60389 


107714 


AA01<T7R1 


AA015761 


Hs.60642 


107775 
IUi 1 /u 


AA018772 


AW008846 


Hs.60857 


IU/OO& 




AA021473 




in7R*w 

lu/ocra 


AA094835 


AW732573 


Hs.47584 


107914 


AA077229 


AA027229 


Hs.61329 


107935 


AA029428 


AA029428 


Hs.61555 


410196 


AA035143 


AI936442 


Hs.59838 


131461 


AA035237


AA992841 


Hs.27263 


108007 


AA039S47 


AA039347 


Hs.61916 


108029 


AA040740 


AA040740 


Hs.62007 


108040 


AA041551 


AL121031


Hs.159971 


108084 


AA045513 


AA058944 


Hs.116602


108088 


AA045745 


AA045745 


Hs.62885 


108168 


AA055348 


AJ453137 


Hs.63176 


1W19 


AA056582 


AA679262 


Hs.14235 


108189 


AA05fifi97 


AW376061 


Hs.63335 


108100 


AA056748 


AA056746 


Hs.63338 


108203 


AA057678


AW847814 


Hs.289005 


108216 


AA058681 


AA524743 


Hs.44883 


108217 


AA058686 


AA058686 


Hs.62588 


108245 


AA062840 


BE410285 


Hs.89545 


108277 


AA054859 


AA064859 




108280 


AA065069 


AA065069 




108309 


AA069923 


AA069818 




108340 


AA070815 


AA069820 


Hs.180909 


108403 


AA075374 


AA075374 




108427 


AA076382 


AA076382 




108435 


AA078787 


T82427 


Hs.194101 


108439 


AA078986 


AA078986 





ESTs 
ESTs 
ESTs 

KIAA1323 protein 

chromosome 11 open reading frame 5 
Homo sapiens mRNA full length insert cDN
ESTs, Weakly similar to ALU8_HUMAN ALU S
putative DNA binding protein
ESTs

KIAA1272 protein

RAD51 (S. cerevisiae) homolog (E coli Re
hypothetical protein DKFZp566G1424
peptidylprolyl isomerase (cyclophilin)-l
KIAA1280 protein

programmed cell death 6-interacting prot
ESTs

LUC7 (S. cerevisiae)-like
ESTs

hypothetical protein FLJ11011

DKFZP586E1621 protein

ESTs; Weakly similar to macrophage lecti

cathepsin F

UDP-N-acetyl-alpha-D-galactosamine:polyp

hypothetical protein MGC4606

nucleophosmin/nucleoplasmin 3

Homo sapiens mRNA; cDNA DKFZp586l0324 (f

KIAA0887 protein

ESTs 

prepronociceptin 
interleukin 1 receptor-like 2
Human clone 2352) mRNA sequence
high-mobility group (nonhistone chromoso
polymerase (RNA) III (DNA directed) (62k
Melanoma associated gene
hypothetical protein DKFZp434E2220
hypothetical protein MGC4707
epimorphin

NS1-associated protein 1
ESTs

hypothetical protein PRO1855

KIAA1058 protein 

H4 histone family, member G 

zinc finger protein 272 

YY1 transcription factor 

EST 

hypothetical protein FLJ13693

ESTs

ESTs

ESTs

gb:ze66c11.s1 Soares retina N2b4HR Homo
potassium voltage-gated channel, delayed
ESTs, Weakly similar to T16370 hypotheti
ESTs

hypothetical protein FLJ10803

KIAA1458 protein

EST

ESTs

SWI/SNF related, matrix associated, acti
Homo sapiens, clone IMAGE:4154008, mRNA,
ESTs
ESTs

hypothetical protein FLJ20008; KIAA1839
ESTs, Moderately similar to A46010 X-lin
EST

Homo sapiens cDNA: FLJ21532 fis, clone C

ESTs

ESTs

proteasome (prosome, macropain) subunit,
gb:zm50f03.s1 Stratagene fibroblast (937
gb2m12e1U1 Stratagene pancreas (93720
gb:zm67e03.s1 Stratagene neuroepithelium
peroxiredoxin 1

gb:zm87a01.s1 Stratagene ovarian cancer
gb:zm91g08.s1 Stratagene ovarian cancer
Homo sapiens cDNA: FLJ20869 fis, clone A
gb:zm92h01.s1 Stratagene ovarian cancer
108455 AA079393
108469 AA079487 

108500 AA083207 

108501 AA083256 
108533 AA084415 
108562 AA085274 
108589 AA088678 
130890 AA100925 
432645 AA101255 
130385 AA126474 
108749 AA127017 

108807 AA129968 

108808 AA130240 
108833 AA131866
108846 AA132983 
108857 AA133250 
131474 AA133583 
108894 AA135941 
108941 AA148650 
108968 AA151110 
108996 AA155754 
109001 AA156125 
131183 AA156289 
109019 AA156997 

109022 AA157291 

109023 AA157293 
109068 AA164293 
109072 AA164676 
426981 AA167375 
130346 AA167550
109146 AA176589 
109172 AA180448 
428438 AA187144 
129208 AA189170 
109222 AA192757 
109300 AA205650 
109481 AA233342 
109485 AA233472 
109516 AA234110 
109537 D80981 
109556 F01660 

109577 F02206 

109578 F02208 
109595 F02544 
109625 F03918 
428376 F04258 
109648 F04600 
109671 F08998 
109699 F09605 
109820 F11115 
109933 H06371 
110014 H10995 
110039 H11938 
110099 H16568 
110107 H16772 
110155 H18951 
110197 H20859 
110223 H23747 
110306 H38087 
110335 H40331 
110342 H40567 
110395 H46956 
110511 H56640 
110523 H57154 
110715 H96712 
110754 N20814
428454 N25249 
431663 N27100 
134263 N39616 
110938 N48982 
110983 N51957 
111081 N59435 
111128 N64139 
431548 N66981 
111216 N68640 
437562 N69352 



AA079393 Hs.3462 
AA079487 

AA083207 Hs.68270 
AA083256 
AA084415 
AA100796 

AI732404 Hs.68846 
AJ907537 Hs.76698 
D14041 Hs.347340 
AW067800 Hs.155223 
AA127017 Hs.71052 
AI652236 Hs.49376 
AA045088 Hs.62738 
AF188527 Hs.61661 
AL117452 Hs.44155
AK001468 Hs.62180 
L46353 Hs.2726 
AK001431 Hs.5105 
AA148650 

AI304870 Hs.188680 
AW995610 Hs.332436 
AI056548 Hs.72116 
AI611807 Hs.285107 
AA156755 Hs.72150 
AA157291 Hs.21479 
AA157293 Hs.72168 
AA164293 Hs.72545 
AI732585 Hs.22394 
ALM4675 Hs.173081 
H05769 Hs.188757 
AA176589 Hs.142078 
AA180448 Hs.144300
NM_001955 Hs.2271
AI587376 Hs.109441 
AA192833 Hs.333512 
AA418276 Hs.170142 
AA878923 Hs.289069 
BE619092 Hs.28465 
AI471639 Hs.71913 
AI858695 Hs.34898 
AI925294 Hs.87385 
F02206 Hs.295639 
F02208 HSJ27214 
AA078629 Hs.27301 
H29490 KS22697 
AF119665 Hs.184011 
H17800 Hs.7154 
R59210 Hs.26634 
H18013 Hs.167483 
AW016809 Hs.119021 
R52417 Hs.20945 
AL109666 Hs.7242 
H11938 Hs.21907 
R44557 Hs.23748 
AW151660 Hs.31444 
AI559626 Hs.93522
AW090386 Hs.112278 
H19836 Hs.31697 
H38087 Hs.105509
H65490 Hs.18845 
H40961 Hs.33008 
AA025116 Hs.33333 
H56640 Hs.221460 
AI040384 Hs.19102
H96712 Hs.269029 
AW302200 Hs.6336 
U55936 Hs.184376 
NM_016569 Hs.267162
AW973443 Hs.8086 
N48982 Hs.38034 
NMJM5367Hs.102b7 
AI146349 Hs.271614 
AW505364 Hs.19074 
AI834273 Hs.9711 
AW139408 Hs.152940 
AB001636 Hs.5683



cytochrome c oxidase subunit VIIc
gb:zm97f08.s1 Stratagene colon HT29 (937
EST 

gb:zn08g12.s1 Stratagene hNT neuron (937
gb:zn06g09.s1 Stratagene hNT neuron (937
gb:zm26c06.s1 Stratagene pancreas (93720
ESTs 

stress-associated endoplasmic reticulum 
H-2K binding factor-2 
stanniocalcin 2
ESTs 

hypothetical protein FU2J644 
ESTs 

ESTs, Weakly similar to AF174605 1 F-box

DKFZP586G1517 protein

anillin (Drosophila Scraps homolog), act

high-mobility group (nonhistone chromoso

hypothetical protein FLJ10569

gb:zo09e06.s1 Stratagene neuroepithelium

ESTs 

EST 

hypothetical protein FLJ20992 similar to
hypothetical protein FLJ13397
ESTs

ubinuclein 1

ESTs 

ESTs 

hypothetical protein FLJ10893
KIAA0530 protein

Homo sapiens, clone MGC:5564, mRNA, comp

EST 

EST 

endothelin 1
MSTP033 protein
similar to rat myomegalin
ESTs

hypothetical protein FLJ21016

Homo sapiens cDNA: FLJ21869 fis, clone H

ESTs 

ESTs 

ESTs 

Homo sapiens potassium channel subunit ( 

ESTs 

ESTs 

ESTs 

pyrophosphatase (inorganic) 

ESTs 

ESTs 

ESTs 

ESTs 

Homo sapiens clone 24993 mRNA sequence

Homo sapiens mRNA full length insert cDN 

histone acetyltransferase 

ESTs 

ESTs 

Homo sapiens mRNA for KIAA1647 protein, 

arrestin, beta 1 

ESTs 

CTL2 gene

ESTs 

ESTs 

ESTs 

ESTs 

ESTs, Weakly similar to organic anion tr 
ESTs 

KIAA0672 gene product
synaptosomal-associated protein, 23kD
TBX3-iso protein

RNA (guanine-7-) methyltransferase
Homo sapiens cDNA FLJ12924 fis, clone NT
MIL1 protein
CGI-12 protein

LATS (large tumor suppressor, Drosophila

novel protein 

ESTs 

DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
111399 R00138 AW270776 Hs.16857
111514 R07998 R07998
428744 R03929 BE267033 Hs.192853
111574 R10307 AI024145 Hs.188526
111804 R33354 AA482478 Hs.181785
111831 R36G83 R36095 Hs.268695
426773 R37938 NM_015556 Hs.172180
111904 R39330 Z41572
428371 R40816 AB012193 Hs.183874
112033 R43162 R49031 Hs.22627
130987 R45698 BE613269 Hs.21893
112300 R54554 H24334 HsJ26125
112513 R68425 R68425 Hs.13809
112514 R68568 R68588 Hs.183373
112522 R68763 R68857 Hs.265499
112540 R70467 R69751
428655 R735S5 H05769 Hs.188757
129534 R73640 AK002126 Hs.11260
112597 R78376 R78376 Hs.29733
112732 R92453 R92453 Hs.34590
451798 T03865 BE297567 Hs^7047
112888 TT13877 AW195317 Hs.107716
131863 T10072 AI656378 Hs.33461
112911 T10080 AW732747 Hs.13493
132215 T10132 AL035703 Hs.4236
112931 T15343 T02966 Hs.167428
112984 T23457 T16971 Hs.289014
112998 T23555 H11257 Hs.22968
133376 T23670 BE618768 Hs.7232
113026 T23948 AA376654
113070 T33464 AB032977 Hs.6298
410781 T34413 AI375672 Hs.165028
113074 T34611 AK001335 Hs.31137
AA828380 Hs.126733
113179 T551R2 BE622021 Hs.152571
113337 T77453 T77453 Hs.302234
113421 T84039 AI769400 Hs.189729
113454 T86458 AI022166 Hs.16188
113481 T87693 T87693 Hs.204327
453345 T89350 AA302862 Hs.90063
113557 T90945 H66470 Hs.16004
113559 T90987 T79763 Hs.14514
113589 T91863 AI078554 Hs.15682
113591 T91881 T91881 Hs.200597
113619 T93783 R08665 Hs.17244
113683 T96687 AB035335 Hs.144519
11369? T96Q44 AL360T43 HS.1793S
113702 T97307 T97307
113717 T97764 T99513 Hs.187447
113824 W48817 AI631964 Hs.34447
113840 W58343 R72137 Hs.7949
113844 W59949 AI369275 Hs.243010
113902 W74644 AA340111 Hs.100009
113904 W74761 AF125044 Hs.19196
113905 W74802 R81733 Hs.33106
113931 W81205 BE255499 Hs.3496
113932 W81237 AA256444 Hs.126485
1319S5 W90146 W79283 Hs.35962
114035 W92798 W92798 HS.26S181
114106 Z38412 AW602528
4«?7308 Z38709 AI416988 Hs^38272
114161 Z389Q4 BE548222 Hs.299883
494949 Z39103 AF052212 Hs.153934
4*57148 Z39930 AW069534 Hs^79583
400037 Z39939 AA251380 Hs.10726
432554 Z40012 AW79813 HSv278411
114277 Z40377 AI052229 Hs^5373
114304 Z40820 AI934204 Hs.16129
114364 Z41680 AL117427 Hs.172778
432620 AA005112 AA777749 Hs.5978
123034 AA005432 AA481157 Hs.108110
131881 AA010163 AW361018 Hs.3383
332421 AA026356 AI909968 Hs.108106
114465 AA026901 BE621056 Hs.131731
451271 AA038867 AK001644 K&26156
332498 AA044644 AA303661


ESTs 

gb^MSgl 1 .$1 Soares fetal liver spleen

ubiquitin-conjugating enzyme E2G 2 (homo

ESTs 

ESTs 

ESTs 

KIAA0440 protein

gb:HSCZYB122 normalized infant brain cDN

cullin 4A

ESTs

hypothetical protein DKFZp761N0624
ESTs

hypothetical protein FLJ10648

src homology 3 domain-containing protein

ESTs 

gb:yi40a10.s1 Soares placenta Nb2HP Homo

Homo sapiens, clone MGC:5564, mRNA, comp

hypothetical protein FLJ11264

EST 

ESTs 

hypothetical protein FLJ20392
hypothetical protein FLJ22344
ESTs

like mouse brain protein E46
KIAA0478 gene product
ESTs 

ESTs, Weakly similar to A43932 mucin 2 p 
Homo sapiens clone IMAGE:451939, mRNA se
acetyl-Coenzyme A carboxylase alpha
eukaryotic translation initiation factor
KIAA1151 protein
ESTs 

protein tyrosine phosphatase, receptor t 
ESTs 

ESTs, Highly similar to IGF-II mRNA-bind

ESTs 

ESTs 

ESTs 

EST 

neurocalcin delta
ESTs 
ESTs 
ESTs 

KIAA0563 gene product 
hypothetical protein FLJ13605
T-cell leukemia/lymphoma 6
DKFZP434H132 protein
gb:ye53fr05.s1 Soares fetal liver spleen
ESTs 
ESTs 

DKFZP586B2420 protein 

Homo sapiens cDNA FLJ14445 fis, clone HE

acyl-Coenzyme A oxidase 1, palmitoyl

ubiquitin-conjugating enzyme HBUCE1 

ESTs 

hypothetical protein MGC15749
hypothetical protein FLJ12604; KIAA1692
ESTs 
ESTs 

gb«C5-BT0562-26010CW)11-A02 BT0562 Homo 
inositol 1,4,5-triphosphate receptor, ty
hypothetical protein FLJ23399
core-binding factor, runt domain, alpha
CGI-81 protein

ESTs, Weakly similar to ALU1_HUMAN ALU S

NCK-associated protein 1

ESTs, Weakly similar to T20410 hypotheti

ESTs 

Homo sapiens mRNA; cDNA DKFZp566P013 (fr 
LIM domain only 7
DKFZP547E2110 protein
upstream regulatory element binding prot
transcription factor
hypothetical protein FLJ11099
hypothetical protein FLJ10782
lymphocyte-specific protein 1
431555 AA046426
132944 AA054515 
114618 AA084162 
332509 AA085749 
114648 AA101056 
114658 AA102746 
132456 AA114250 
450847 AA126551 
132225 AA128980 
437197 AA129757 
114709 AA129921 
455926 AA133331 
114750 AA135958 
426805 AA136524 
114763 AA147044 
114767 AA148885 
114774 AA150043 
129388 AA151621 
457742 AA155743 
456200 AA156335 
130207 AA156336 
114798 AA159181 
114800 AA159825 
114828 AA234185 
114846 AA234929 
114848 AA234935 
114902 AA236359 
132271 AA236466 
114907 AA236535 
420170 AA236935 
132204 AA236942 
114928 AA237018 
132481 AA237025 
114932 AA242751 
314162 AA242760 
131006 AA242763 
114935 AA242809 
408908 AA243133 
437754 AA243495 
114957 AA243706 
114974 AA250848 
114977 AA250868 
114995 AA251152 
115005 AA251544 
417177 AA251792 
115026 AA252144 
115045 AA252524 
115068 AA253461 
133138 AA255522 
332668 AA255522 
115114 AA256468 
129584 AA256528 
115137 AA257976 
417187 AA258296 

115166 AA258409 

115167 AA258421 
436719 AA262077 
115239 AA278650 
115243 AA278766 
428419 AA280791 
115322 AA280819 
413303 AA280828 
115372 AA282195 
409962 AA283127 
130269 AA284694 
456570 AA291137 
332675 AA291708 
407864 AA293495 
115536 AA347193 
408799 AA398474 
115575 AA398512 
115601 AA400277
434428 AA400896 
115683 AA410345 
115715 AA416733 
132952 AA425154 



AI815470 Hs.260024

T96641 Hs.6127 

AW979261 Hs.291993 

AA128376 Hs.153884 
AA101056 

AA102383 Hs^49190 

AB011084 Hs.48924 
NM_003155 Hs.25590
AA128980 
W38586 

AA397651 Hs.301959 

AB018284 Hs.153588

AA887211 Hs.129467 

T19228 Hs.172572 

AA810755 Hs.102500 

AI859865 Hs.154443

AV656017 Hs.184325 

AA662477 Hs.110964

BE561824 Hs.273369 

AA768242 Hs.80618 

AF044209 Hs.144904 

AA159181 Hs.54900 

Z19448 Hs.131887 

AA252937 Hs.283522 

BE018682 Hs.166196 

BE614347 Hs.169615 

AW275480 Hs.39504 

AB030034 Hs.115175

N29390 Hs.13804 

U43374 Hs.95631 

AA235827 Hs.42265 

AA237018 Hs.94869 

W93378 Hs.49614 

AA971436 Hs.16218 

BE041820 Hs.38516 

AF064104 Hs.22116 

H23329 Hs.290880 

BE296227 Hs.250822 

R60366 Hs.5822 

AW170425 Hs.87680 

AW966931 Hs.302649 

AW296978 Hs.87787

AA769266 Hs.193657

AI760825 Hs.153042
NM_004458 Hs.81452

AA251972 Hs.188716 

AW014549 Hs.58373 

AW512260 Hs.87767 

AV657594 Hs.161161 

AV657594 Hs.181161 

AA527548 Hs.7527

AV656017 Hs.184325 

AW968304 Hs.56156 

AB011151 Hs.334659 

AF095727 Hs.287832 

AA749209 Hs.43728 

Y11192 Hs.5299 

BE251328 Hs.73291 

AA806600 Hs.116665
U49436 

L08895 Hs.78995 

AW836130 Hs.75277 

AW014385 Hs.88678 

U82671 Hs.57698 

F05422 Hs.168352 

AA286914 Hs.183299 
BE439944 

AF069291 Hs.40539 

AK001468 Hs.62180 

AA059412 Hs.47986 

AA393254 Hs.43619 

AA148984 Hs.48849 

D14540 Hs.199160

AF255910 Hs.54650 

BE395161 Hs.1390 

AI658580 Hs.61426



Cdc42 effector protein 3 

Homo sapiens cDNA: FLJ23020 fis, clone L

ESTs

ATP binding protein associated with cell
gb:zn25b03.s1 Stratagene neuroepithelium
tumor necrosis factor receptor superfami
KIAA0512 gene product ALEX2
stanniocalcin 1

gb:zo09a11.s1 Stratagene neuroepithelium
guanine nucleotide binding protein (G pr
proline synthetase co-transcribed (bacte
KIAA0741 gene product
ESTs

hypothetical protein FLJ20093
hypothetical protein dJ511E16.2
minichromosome maintenance deficient (S.
CGI-76 protein 

hypothetical protein FLJ23471

uncharacterized hematopoietic stem/proge

hypothetical protein

nuclear receptor co-repressor 1

serologically defined colon cancer antig

ESTs, Weakly similar to T24396 hypotheti

Homo sapiens mRNA; cDNA DKFZp434J1912 (f 

ATPase, Class I, type 8B, member 1 

hypothetical protein FLJ20989

hypothetical protein MGC4308 

sterile-alpha motif and leucine zipper c 

hypothetical protein dJ462023.2 

Human normal keratinocyte mRNA 

ESTs 

ESTs 

ESTs 

KIAA0903 protein 

Homo sapiens, clone MGC:15887, mRNA, com

CDC14 (cell division cycle 14, S. cerevi

ESTs, Weakly similar to ALU1_HUMAN ALU S

serine/threonine kinase 15

Homo sapiens cDNA: FLJ22120 fis, clone H

ESTs

nucleosome assembly protein 1-like 1

ESTs 

ESTs 

ESTs 

fatty-acid-Coenzyme A ligase, long-chain

ESTs 

ESTs 

ESTs 

Homo sapiens cDNA FLJ14643 fis, clone NT
ESTs

small fragment nuclease
CGI-76 protein
ESTs 

hypothetical protein MGC14139 
myelin protein zero-like 1 
hypothetical protein 

aldehyde dehydrogenase 5 family, member 
hypothetical protein FLJ10881
KIAA1842 protein
KIAA1856 protein

MADS box transcription enhancer factor 2
hypothetical protein FLJ13910
ESTs, Weakly similar to Unknown [H.sapie
Target CAT

nucleoporin-like protein 1

ESTs 

ESTs 

chromosome 8 open reading frame 1 
anillin (Drosophila Scraps homolog), act
hypothetical protein MGC10940
ESTs

ESTs, Weakly similar to ALU4_HUMAN ALU S
myeloid/lymphoid or mixed-lineage leukem
junctional adhesion molecule 2
proteasome (prosome, macropain) subunit,
Homo sapiens mesenchymal stem cell prote
115819
409124 
115895 
458073 
115962 
115957 
115974 
115985 
129254 
446730 
116095 
426856 
116210 
116213 
432645 
116265 
129334 
116274 
426002 
116331 
116333 
132994 
418538 
116391 
116394 
134531 
116417 
116429 
116439 
116459 
427505 
409633 
116541 
132557 
414964 
116571 
451522 
421919 
116643 
116661 
116715 
116729 
318709 
418999 
116773 
116780 
453884 
116819 
427278 
407833 
116844 
116845 
116892 
116925 
116981 
453133 
117031 
117034 
431129 
417861 
117280 
117344 
117422 
117475 
117487 
117937 
130207 
117549 
117683 
117710 
117791 
117822 
422544 
117895 
452259 
133057 



AA426573 

AA431418 

AA436182 

AA437099 

AA446585 

AA446887 

AA447224 

AA447709 

AA453624 

AA455044 

AA456045 

AA460454 

AA476494 

AA476738 

AA481422 

AA482595 

AA485084 

AA485431 

AA489638 

AA491000 

AA491250 

AA505133 

AA598447 

AA599243 

AA599574 

AA600153 

AA609309 

AA609710 

AA610068 

AA621399 

AA621752 

C21523 

D12160 

D19708 

D25801 

D45652 

D60208 

D80504 

F03010 

F04247 

F10966 

F13700 

H05063 

H16758 

H17315 

H22566 

H48459 

H53073 

H56559 

H57957 

H64938 

H64973 

H69535 

H73110 

H81783 

H86259 

H88353 



H88675 
H93708 
N22107 
N24046 
N27028 
N30205 
N30621 
N33258 
N33258 
N33390 
N40160 
N45198 
N48325 
N4S913 
N49394 
N50656 
N50721 
N53143 



AA486620 Hs.41135 
AW292809 Hs.50727 
AB033035 Hs.51955 
AA192669 Hs.45032
AI636361 Hs.179520
AI745379 Hs.42911
BE513442 Hs.238944
AA447709 Hs.268115
AA252468 Hs.1098 
BE384932 Hs.64313 
AA043429 Hs.62618 
R19768 Hs.172788 
BE622792 Hs.172788 
AA292105 HSJ26740 
D14041 Hs.347340 
BE297412 Hs.55189 
AW157022 Hs.343551
AI129767 Hs.182874 
BE514376 Hs.165998 
N41300 Hs.71616 
AF155827 Hs.203963
AA112748 Hs.279905
BE244323 Hs.85951
T86558 Hs.75113 
NM_006033 Hs.65370
AI742845 Hs.110713 
AW499664 

AF191018 Hs.279923 
AA251594 Hs.43913 
R80137 Hs^02738 
AA361562 Hs.178761 
AW449822 Hs£5200 
D12160 Hs^49212 
AA114926 Hs.169531
AA337548 Hs.333402
D45652 Hs.211604
BE565817 Hs.26498
AJ224901 Hs.109526 



AI367044

R61504 

AL117440 



Hs.153638 

Hs.170263 
BE549407 Hs.115823 
R52576 HS.2B528Q 
NM_000121 Hs.89548
AJ823410 Hs.343581 
H22566 Hs.63931 
AA355925 Hs.36232 
H53073 Hs.93698 
AL03142B Hs.174174 
AW955632 Hs.65665 
H64938 Hs.337434 
AA649530 Hs.348148 
AI573283 Hs.38458



H73110
N29218 



Hs.260603 
Hs.40290 



AC005757 Hs.31809 
H88353 Hs.347265 
U72209 
AL137751 
AA334551 
M18217 
R19085 
AI355562 
N30205
N30621



Hs.263671

Hs.172129 

Hs.210705 

Hs.43880 

HS33740 

Hs.44203



AF044209 Hs.144904 
AF044209 Hs.144904 



N33390 
N40180 
N45198 
N48325 



Hs.44483 

Hs.47248 
Hs.93956 
AA706282 HsS3963 
AB018259 HS.11B140 
AW450348 Hs.33996
AA317439 Hs^8707 
AA465131 Hs.64001 



endomucin-2

N-acetylglucosaminidase, alpha- (Sanfili

KIAA1209 protein

ESTs

hypothetical protein MGC10702
ESTs

hypothetical protein FLJ10631

ESTs, Weakly similar to T08599 probable

DKFZp434J1813 protein

ESTs, Weakly similar to AF257182 1 G-pro

ESTs 

ALEX3 protein 
ALEX3 protein 

hypothetical protein MGC10947 

H-2K binding factor-2

hypothetical protein 

hypothetical protein FLJ22584

guanine nucleotide binding protein (G pr 

PAM mRNA-binding protein

Homo sapiens mRNA; cDNA DKFZp586N1720 (f

hypothetical protein FLJ10339

clone HQ0310 PRO0310p1

exportin, tRNA (nuclear export receptor

general transcription factor IIIA



DEK oncogene (DNA binding) 
Human clone 23826 mRNA sequence
putative nucleotide binding protein, est
PIBF1 gene product

Homo sapiens cDNA: FLJ214# fis, clone C
26S proteasome-associated pad1 homolog
ESTs 

polymerase (RNA) III (DNA directed) (155
ESTs 

hypothetical protein MGC12760 
gb:HUMGS02&48 Human adult lung 3' direct
hypothetical protein FLJ21657
zinc finger protein 198
myeloid/lymphoid or mixed-lineage leukem
gb:yh16a03.s1 Soares infant brain 1NIB H
tumor protein p53-binding protein, 1
ribonuclease P, 40kD subunit
Homo sapiens cDNA: FLJ22096 fis, clone H
erythropoietin receptor 
karyopherin alpha 1 (importin alpha 5) 
ESTs 

KIAA0186 gene product
EST

KIAA0601 protein

ESTs, Weakly similar to S19560 proline-r
ESTs, Weakly similar to A46010 X-linked
gb:ns44f05.s1 NCI_CGAP_Alv1 Homo sapiens
ESTs

ESTs, Moderately similar to A47582 B-cel
ESTs 



gb:yw21a01.s1 Morton Fetal Cochlea Homo

YY1-associated factor 2

Homo sapiens mRNA; cDNA DKFZp434l0812 (f

sperm specific antigen 2

Homo sapiens cDNA: FLJ21409 fis, clone C

Homo sapiens cDNA FLJ13182 fis, clone NT

ESTs, Weakly similar to A46010 X-linked

ESTs, Weakly similar to I38022 hypotheti

ESTs 

nuclear receptor co-repressor 1 
nuclear receptor co-repressor 1
EST

gb:yy44d02.s1 Soares_multiple_sclerosis_
ESTs, Highly similar to similar to Cdc14
EST
ESTs

KIAA0716 gene product
ESTs, Highly similar to SORT_HUMAN SORTI
signal sequence receptor, gamma (translo
Homo sapiens clone 25218 mRNA sequence

18690 
18766 
18793 
18817 
18844 
18919 
29558 
407604 



19021 
19039 
19063 

332622 
19111 
15115 
19146 

449224 



18103 
18111 
18129 
1B278 
18329 
16336 
17098 
18363 
18364 
18475 
18491 
18500 
18584 
56647 
18661 
18684 



N55326 
N55493 
N57493 
N62955 
N63520
N63604 
N64166 
N64168 
N64191 
N66845
N67135 
N67295 



19281 
19298 
26502 
19983 
19558 
29641 
19445 



19654 
19683 
19694 
19718 
10365 
19938 
20128 
20130 
20148 
20155 
51979 
20163 
20184 
20211 
20245 
20247 
20254 
20259 
20260 
20275 
20284 
17735 
22137 
20302 
20303 
20305 
20319 
408729 
20326 
33145 
20327 
20328 
20340 



N70777 

N713S4 

N71545 

N71571 

N74456 

N75594 

N79035 

N80279 

N91797 

N92454 

N94581

N94746 

N98238 

R02384 

R16833 

R41828 

R43203 

R46395

R58863 

R78248 

T11483 

T16896 

T23820 

T30222 

W15275 

W38194 

W42414 

W49632 

W57613 

W57759 

W61118 

W65344 

W69216 

W69379 

W86728 

Z38499 

Z38630 

Z39494 

Z39623 

Z40071 

Z40174 

Z40182 

Z40904 

AA166965 

AA167500 

AA169599 

AA171724 

AA171739 

AA177105 

AA182626 

AA186324 

AA192099 

AA192173 

AA192415 

AA192553 

AA194851 

AA195520 

AA196300 

AA196549 

AA196721 

AA196979 

AA206828 



AA401733 Hs.184134 

N55493 

N57493 

N62955 Hs.316433 
N63520 

BE327311 Hs.47166 
AB017365 Hs.173859 
AI183838 Hs.48938
N46114 Hs.29169
N66845 

AV647908 Hs.90424 
W32889 Hs.154329 
AW136928

AI252640 Hs.110364 
AL137554 Hs.49927 
N71313 Hs.163986 
AW390601 Hs.184544 
N71571 Hs.269142
N74456 Hs.50499 
N75594 Hs.285921 
AI668658 Hs.50797 
AL035364 Hs.50891 
AW452696 Hs.130760 
AW580922 Hs.180446 
AW191962 Hs.288061 
N94746 Hs.274248 
N98238 Hs.55185 
AI160570 Hs.252097 
R16833 Hs.53106 
R10674 

T02865 Hs.328321 
AA214228 Hs.127751 
R58863 Hs.91815 
AW995911 Hs.299883 
T11483 

AI692322 Hs.65373 
NM_001241 Hs.155478
T10077 Hs.13453
W55956 Hs.94030
W38194 

AW081883 Hs.211578 
AA884471 Hs.90449 
R82342 Hs.79856 
W57759 

W65379 Hs.57835 
AA041350 Hs.57847 
W69216 Hs.92848 
AI287518 

AW014862 Hs.58885 
BE379320 Hs.91448 
AA045767 Hs.5300 
F02806 Hs.65765 
Z39623 Hs.65783 
F06972 Hs.27372 
AW082866 Hs.65882 
Z40182 Hs.65885 
Z40904 Hs.66012 
AW959615 Hs.111045 
AA167500 Hs.103939 
W90403 Hs.111054 
AW014786 Hs.192742 
AK000061 Hs.101590 
AA177105 Hs.78457 
AA179656 

AA188175 Hs.82506 
AJ236865 

AA837098 Hs.269933 
AI216292 Hs.96184 
AW295096 Hs.101337 
T57776 Hs.191094 
AA195764 Hs.72639 
AA196300 Hs^1145 
H94227 HsX592 
AK000292 Hs.130732 
AA923278 Hs390905 
AA206828 



ESTs 

gb:yv50dKLs1 Soares fetal liver spleen
gb:yy54c08.s1 Soares_multiple_sclerosis_
Homo sapiens cDNA FLJ11375 fis, clone HE
gb:yy62f01.s1 Soares_multiple_sclerosis_
KT021

frizzled (Drosophila) homolog 7
hypothetical protein FLJ21B02
hypothetical protein FLJ22623
gb:za46c11.s1 Soares fetal liver spleen
Homo sapiens cDNA: FLJ23285 fis, clone H
ESTs 

gkUUiei1.adpKiWUJl.s1 NCIJJGAPJSu 

peptidylprolyl isomerase C (cyclophilin

protein kinase NYD-SP15

Homo sapiens cDNA: FLJ22765 fis, clone K

Homo sapiens, clone IMAGE:3355383, mRNA,

ESTs 

EST 

ESTs, Moderately similar to T47135 hypot 
ESTs 

hypothetical protein 

myosin phosphatase, target subunit 2 

karyopherin (importin) beta 1 

collagen, type VIII, alpha 2 

hypothetical protein FLJ20758

ESTs 

pregnancy specific beta-1-glycoprotein 6
ESTs, Moderately similar to ALU1_HUMAN A
CSR1 protein 
EST 

hypothetical protein 
ESTs 

hypothetical protein FLJ23399
gb:CHR90049 Chromosome 9 exon Homo sapie
ESTs, Weakly similar to T02345 hypotheti
cyclin T2

hypothetical protein FLJ14753
Homo sapiens mRNA; cDNA DKFZp586E1624 (f
Empirically selected from AFFX single pr
Homo sapiens cDNA: FLJ23037 fis, clone L
Human clone 23908 mRNA sequence
ESTs, Weakly similar to S65657 alpha-1C-
gb:zd20g11.s1 Soares_fetal_heart_NbHH19W
ESTs 

ESTs, Moderately similar to ICE4_HUMAN C 
ESTs 

Homo sapiens mRNA; cDNA DKFZp586D0923 (f 
ESTs 

MKP-1 like protein tyrosine phosphatase
bladder cancer associated protein 
ESTs 
ESTs 

BMX non-receptor tyrosine kinase 

ESTs 

EST 

EST 

ESTs 

EST 

ESTs 

hypothetical protein FLJ12785

hypothetical protein 

solute carrier family 25 (mitochondrial

gb:zp54e11.s1 Stratagene NT2 neuronal pr

KIAA1254 protein

zinc finger protein 148 (pHZ-52)

ESTs 

ESTs 

uncoupling protein 3 (mitochondrial, pro 

ESTs 

ESTs 

hypothetical protein RG083M05.2

Homo sapiens, clone IMAGE:2961368, mRNA,

hypothetical protein FLJ20285

ESTs, Weakly similar to protease [H.sapi

gb:zq80b08.s1 Stratagene hNT neuron (937
417122 AA207123 AI906291 Hs.81234 immunoglobulin superfamily, member 3
131522 AA214539 AI380040 Hs.239469 TIA1 cytotoxic granule-associated RNA-bi
421787 AA226914 AA2270S8 Hs.108301 nuclear receptor subfamily 2, group C, m
120375 AA227260 AF028706 Hs.111227 Zic family member 3 (odd-paired Drosophi
120376 AA227469 AA227469 gb:zr18a07.s1 Stratagene NT2 neuronal pr
120390 AA233122 AA837093 Hs.111460 calcium/calmodulin-dependent protein kin
410804 AA233334 U64820 Hs.66521 Machado-Joseph disease (spinocerebellar
434223 AA233347 AI825842 Hs.3776 zinc finger protein 216
312771 AA233714 AA018515 Hs.264482 Homo sapiens mRNA; cDNA DKFZp761A0411 (f
120396 AA233796 AA134006 Hs.79306 eukaryotic translation initiation factor
120409 AA235050 AA235050 gb:zs38e04.s1 Soares_NhHMPu_S1 Homo sapi
120414 AA235704 AW137156 Hs.181202 hypothetical protein FLJ10038
120420 AA236031 AI128114 Hs.112885 spinal cord-derived growth factor-B
120422 AA236352 AL133097 Hs.301717 hypothetical protein DKFZp434N1928
419326 AA236390 W94915 Hs.42419 ESTs
120423 AA236453 AA236453 Hs.18978 Homo sapiens cDNA: FLJ22822 fis, clone K
120435 AA243370 AA243370 Hs.95450 EST
120453 AA250947 AA250947 Hs.170263 tumor protein p53-binding protein, 1
120455 AA251083 AA251720 Hs.104347 ESTs, Weakly similar to ALUC_HUMAN !!!!
120456 AA251113 AA488750 Hs.88414 BTB and CNC homology 1, basic leucine zi
120473 AA251973 AA251973 Hs.269988 ESTs
128922 AA252023 AI244901 Hs.9589 ubiquilin 1
120477 AA252414 AA252414 Hs.43141 DKFZP727C091 protein
120479 AA252650 AF006689 Hs.110299 mitogen-activated protein kinase kinase
120488 AA255523 AW952916 Hs.63510 KIAA0141 gene product
120510 AA258128 AI796395 Hs.111377 ESTs
120527 AA262105 AA262105 Hs.4094 Homo sapiens cDNA FLJ14208 fis, clone NT
120528 AA262107 AI923511 Hs.104413 ESTs
120529 AA262235 AI434823 Hs.104415 ESTs
120541 AA278298 W07318 Hs.240 M-phase phosphoprotein 1
120544 AA276721 BE548277 Hs.103104 ESTs
120562 AA280036 BE244580 Hs.342307 hypothetical protein FLJ10330
120569 AA260648 AA807544 Hs.24970 ESTs, Weakly similar to B34323 GTP-bindi
120571 AA280738 AB037744 Hs.34892 KIAA1323 protein
120572 AA280794 H39599 Hs.294008 ESTs
129434 AA280837 AW967495 Hs.186644 ESTs
130529 AA280885 AA178953 Hs.309648 gb:zp39e03.s1 Stratagene muscle 937209 H
120575 AA280934 AW978022 Hs.238911 hypothetical protein DKFZp762E1511;KIAA
409339 AA281535 AB020686 Hs.54037 ectonucleotide pyrophosphatase/phosphodi
120591 AA281797 AF078847 Hs.191356 general transcription factor IIH, polype
120593 AA282047 AA748355 Hs.193522 ESTs
430275 AA283002 Z11773 Hs.237786 zinc finger protein 187
440303 AA283709 AA306166 Hs.7145 calpain 7
120609 AA283902 AW978721 Hs.266076 ESTs, Weakly similar to A46010 X-linked
409702 AA284108 AI752244 eukaryotic translation elongation factor
456870 AA284109 AI241084 Hs.154353 nonselective sodium potassium/proton exc
132614 AA284371 AA284371 Hs.118084 similar to rat nuclear ubiquitous casein
458750 AA284744 AA115496 Hs.336898 Homo sapiens, Similar to RIKEN cDNA 1810
135376 AA284784 BE617856 Hs.99756 mitochondrial ribosome recycling factor
120621 AA284840 AW961294 Hs.143818 hypothetical protein FLJ23459
452279 AA286844 AA286844 Hs.61260 hypothetical protein FLJ13164
332484 AA287032 AW172431 Hs.13012 ESTs
120644 AA287038 AI869129 Hs.96616 ESTs
120660 AA287546 AA286785 Hs.99677 ESTs
135370 AA287553 BE622187 Hs.99670 ESTs, Weakly similar to I38022 hypotheti
120661 AA287556 AA287556 Hs.263412 ESTs, Weakly similar to ALUB_HUMAN !!!!
429828 AA287564 AB019494 Hs^25767 IDN3 protein
452291 AA291015 AF015592 Hs.28853 CDC7 (cell division cycle 7, S. cerevisi
120699 AA291716 AI683243 Hs.97258 ESTs, Moderately similar to S29539 ribos
100690 AA291749 AA383256 Hs.1657 estrogen receptor 1
120726 AA293656 AA293655 Hs.21198 ESTs
120737 AA302430 AUJ49176 Hs^2223 chordin-like
120745 AA302809 AA302809 gb:EST10426 Adipose tissue, white I Homo
443574 AA302820 U83993 Hs.321709 purinergic receptor P2X, ligand-gated io
120750 AA310499 AI191410 Hs.96693 ESTs, Moderately similar to 2109260A B c
120761 AA321890 AA321690 branched chain keto acid dehydrogenase E
120768 AA340589 AA340589 Hs.104560 EST
120769 AA340622 AI769467 Hs.9475 ESTs
135232 AA342457 AL038812 Hs.96800 ESTs, Moderately similar to ALU7_HUMAN A
120793 AA342864 AA342864 Hs^6812 ESTs
120796 AA342973 AI247356 Hs.96820 ESTs
120809 AA346495 AA346495 gb:EST52657 Fetal heart II Homo sapiens
332633 AA347573 AL120071 Hs.48998 fibronectin leucine rich transmembrane p
120825 AA347614 AI280215 Hs.96885 ESTs
120827 AA347717 AA382525 Hs.132967 Human EST clone 122887 mariner transposo
120839 AA348913 AA348913 gb:EST55442 Infant adrenal gland P Homo
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120850 AA349S47 


AA349647 


Hs.96927 


Homo sapiens cONA FU 12573 fs, clone NT 


120852 AA349773 


AA349773 


Hs.191564 


ESTs 


128852 AA350541 


R40622 


Hs.106601 


ESTs 


135240 AA357159 


AA357159 


Hs.96986 


EST 


120870 AA357172 


AA357172 


Hs .292581 


ESTs, Moderately similar to A1U1_HUMAN A 


120894 AA370132 


AA370132 


Hs.97063 


ESTs 


435737 AA370472 


AF229839 


Hs.173202 


l-kappa-B4nteracfing Ras-fike protein 1 


12)897 AA370867 


AA370867 

rVwr www I 


Hs.97079 


ESTs. Moderated similar to AF174605 1 F 


120915 AA377298 


AL1 35556 


Hs 97104 


ESTs 


120935 AA3839Q2 


AL048409 


Hs.97177 


ESTs-WeaklvslntofoALUI HUMAN ALUS 


120936 AA385934 


AA385934 


Hs 97184 


EST Hbhtv simitar to (defline not aval 


120937 AA38S255 


AA386255 


Hs 97186 


EST 


120938 AA386260 


AA386260 


Hs 104632 


EST 


417632 AA386266 


R20855 


Hs.5422 


oivcofirnTAin MhR 

MtjwwpiWiwUI IWRrffcrf 


120960 AA398014 


AA3S8014 


Hs.104684 


EST 


120985 AA398222 


AJ219896 


Hs.97592 


ESTs 


120988 AA398235 


AA398235 


Hs.97631 


ESTs 


121008 AA398348 


AA398348 


Hs. 130546 


Human DNA sequence from clone RP1 1-251J8 


121029 AA398482 


AA398482 


Hs.97641 


EST 


121032 AA398504 


AA393037 


Hs.1 61798 


ESTs 


121033 AA398505 


AA398505 


Hs.97360 


ESTs 


121034 AA398507 


AL389951 


Hs 271623 

t lw«4f ■ U4v 


nudeonorin 50kO 


121035 AA398523 


AA398523 


Hs.210579 


ESTs 


121058 AA398625 

i & i www nnwwuw 


AA398625 


Hs 97391 


ESTs 


121050 AA398632 


AA398632 


Hs.97395 

1 lw»w * WWW 


ESTs 


121061 AA398633 




Hs 97396 


ESTs 


121091 AA398894 




Hs 97657 


FSTY Mrriprateh/<drnnartoALU8 HUMANA 


121092 AA398895 


AA398895 


Hs.97658 


EST 


121094 AA398900 


AA402505 




ab*2l62h10 rl Soares testis NHT Homo san 


121096 AA398904 


AA398904 


Hs 332690 


ESTs 


121115 AA399122 


AA398187 


Hs 104682 


ESTs Weaklv simitar to mitochondrial ci 

WW 1 Of I VwwitlJ mIJIUICU iw 1 J UIWW4 *w4 fUl IOJ \M 


121121 AA399371 


AA399371 


Hs.189095 


similar to SALL1 (sal (Orosophlla)-I0ce 


121122 AA399373 


AJ126713 


Hs.1 92233 


ESTs, Highly similar to T00337 hypotheti 


121125 AA399441 


AL042981 


Hs.251278 


K1AA1201 protein 


121151 AA399636 


AA399636 


Hs.143629 


ESTs 


121153 AA399640 

1 1- 1 l«M nnwWv7V 


AA399640 


Hs 97694 


ESTs 


121163 AA399680 


A 1676(162 


Hs 111902 


ESTs 


121176 AA400080 


AL1 21523 

ki 


Hs 97774 


ESTs 


121192 AA400262 


AA400262 


Hs.190093 


ESTs 


121223 AA400725 


AJ002110 


Hs.97169 


ESTs Wealdv slmBar to d J667H1 2 2 1 fH 


121227 AA400748 

\e\fjti nmuu/to 


AA400748 


Hs.97823 


Homo qaotens mRNA* cDNA DKFZd434D024 ffr 


121231 AA400780 


AA814948 


Hs.96343 


ESTs Weaktv simitar to ALUC HUMAN III] 


121278 AM01631 

1 ILf V nrTTV ivw 1 


AAQ37121 


Hs.98518 


Homo saofens cDNA FU1 1490 fis clone HE 


121279 AM01688 


AA292873 


Hs.177996 


ESTs 


121282 AA401695 


AA401695 


Hs.97334 


ESTs 


121299 AA402227 


AA402227 


Hs.22826 


trooomodulin 3 fuhiouifous) 


121301 AA402329 


NM.006202HS.89901 


phosfrfiodiesterase 4A, cAMP-specitlc (dun 


121302 AA402398 


AA402587 


Hs.325520 


LAT1-3TM oroteln 


121304 AA402449 


AA293863 


Hs.97316 


EST 


121305 AA402468 


AA402458 


KS291557 


ESTs 


1 id 721 AA4Q3268 


AK000112 


Hs.89306 


hvnnthptiral nmtprn FLI20105 


121T21 AA4G3314 


AA291411 


Hs.97247 


ESTs 


101 704 AA404229 


AA404229 


Hs.97842 


EST 


AAM70 AAAOiSfifl 


AI768623 


Hs.108264 


CO IS 


1^1074 AA4/W771 


U16125 


Hs.181581 


nhrtama+o rworrfnr innn+mrvr k-prnato 


121 W AA4fl5fi?6 


AA405026 


Hs.193754 


CCTe 


121348 AA405182 AA405182 Hs.97973 ESTs
121350 AA405237 AA405237 gb:zf06p10.s1 NCI_CGAP_GCB1 Homo sapiens
121400 AA406061 AA406061 Hs.98001 EST
121402 AA408053 AA406063 Hs.98003 ESTs
121403 AA406070 AA406070 Hs.98004 EST
121408 AA406137 AA406137 Hs.98019 EST
121431 AA406335 AA035279 Hs.176731 ESTs
121471 AA411804 AA411804 Hs^61575 ESTs
121474 AA411833 AA402335 Hs.168760 ESTs, Highly similar to Trad [H.sapiens]


121*126 AA412219 


AW665325 Hs.98120 


ESTs 


121*v?0 AA412259 


AA778658 


Hs.98122 


ESTs 


121558 AA412497 AA412497 gb:zt95g12.s1 Soares_testis_NHT Homo sap
121559 AA412498 AI192044 Hs.104778 ESTs
121584 AA416586 AI024471 Hs.98232 ESTs
121609 AA416867 AA416857 Hs£8185 EST
121612 AA416874 AA416874 Hs.98168 ESTs
121737 AA421133 AA421133 Hs.104671 erythrocyte transmembrane protein
121740 AA421138 AA421138 Hs.143835 EST
436032 AA422079 AA150797 Hs.109276 latexin protein
121784 AA423837 T90789 Hs.94308 RAB35, member RAS oncogene family
121802 AA424328

121803 AA424339 
135286 AA424469 
332778 AA424469 
121806 AA424502 
129517 AA425004 
121845 AA425734 
121853 AA425887 
121891 AA426455 
121895 AA427336 
121899 AA427555 

121917 AA428218 

121918 AA428242 

121919 AA428281 

121941 AA428865 

121942 AA428994 
121970 AA429666 
121993 AA430181 
418706 AA430184 
122022 AA431293 

122050 AA431478 

122051 AA431492 
122055 AA431732 
122105 AA432278 

•122125 AA434411 
135235 AA435512 
122162 AA435598 
422072 AA435711 
415106 AA435815 
122186 AA435842 
122235 AA436475 
412970 AA436489 
419288 AA442060 
122310 AA442079 
122334 AA443151 
122382 AA446133 
122425 AA447145 
122431 AA447398 
122450 M447643 
426284 AA447742 
122477 AA448226 
122500 AA448825 
122522 AA449444 
122536 AA450087 
122538 AA450211 
122540 AA450244 
122560 AA452123 
421919 AA452155 
122562 AA452156 
122585 AA453036 
122608 AA453526 

122635 AA454085 

122636 AA454103 
122653 AA454642 
122660 AA454935 
122703 AA456323 
122724 AA457395 
122749 AA458850 
122772 AA459662 
430242 AA459668 
429838 AA459679 
122777 AA459702 
135362 AA460017 
122798 AA460324 
122837 AA461509 

122860 AA464414 

122861 AA464428 
122910 AA470084 
132899 AA476606 
122967 AA478521 
422845 AA478523 
123009 AA479949 
128917 AA481252 
123081 AA485351 
123133 AA487264 
123184 AA489072 



AI251870 Hs.188898 
AI338371 Hs.157173 
AW023482 Hs.97849 
AWQ23482 Hs.97849 
AA424313 Hs.98402 
AW972853 Hs.112237 
AI732692 Hs.165066 
AA425887 Hs.98502 
AA426456 Hs.98469 
AM27396 

R55341 H&50421 
AA406397 Hs.139425 
BE274689 Hs.184175 
AA428281 Hs.98560 
AA428865 Hs.98563 
AW452701 Hs.293237 
AA429666 Hs.98617 
AW297880 Hs.98661 
U73524 Hs.87465 
'AA431293 Hs.98716 
A1453076 

AA431492 Hs.98742 
AA431732 Hs.98747 
AW241685 Hs.98699 
AK000492 Hs.98806 
AW298244 Hs.266195 
AA628233 Hs.79946 
AB018255 Hs.111138 
U40763 Hs.77965 
AA398811 Hs.104673 
AA436475 Hs.1 12227 
AB026436 Hs.177534 
AA256106 Hs.87507 
AW192803 Hs.98974 
BE465894 Hs.98365 
AA446440 Hs.98643 
AB007859 Hs.100955 
AA447398 Hs.99104 
AA447643 Hs.112095 
AJ404468 Hs^84259 
AA448226 Hs.324123 
AA448825 Hs.99190 
AA299607 Hs.98969 
AF060877 Hs.99236 
AA450211 Hs.99239 
AA476741 Hs.98279 
AW392342 Hs.283077 
AJ224901 Hs.109526 
AA452156 

AI681654 Hs.170737 
AA453525 Hs.143077 
AA454085 

AW651706 Hs.99519 
AW009166 Hs.99376 
AJ816827 Hs.180069 
AA456323 Hs.269369 
AA457395 Hs.99457 
AA458850 Hs.293372 
AW1 17452 Hs.99489 
U66669 Hs.236642 
AW904907 Hs.30732 
AKD01022 Hs.214397 
AA978128 Hs.99513 
AW366286 Hs.145696 
AA461509 Hs293565 
AA464414 

AA335721 Hs.213628 
AM70084 Hs.98358 
AA476606 Hs.59666 
AA806187 Hs.289101 
AA317841 Hs.7845 
AA535244 Hs.78305 
AJ365215 Hs.206097 
A1815486. Hs.243901 
AA487264 Hs.154974 
BE247767 Hs.18166 



ESTs 
ESTs 
ESTs 
ESTs 
ESTs 
ESTs 

ESTs, Moderately similar to ALU2_HUMAN A

hypothetical protein FLJ14303

ESTs

gb:zw33a02^1 Soares ovary tumor NbHOT H

KIAA0203 gene product

ESTs

chromosome 2 open reading frame 3

EST

ESTs

ESTs

EST

ESTs

ATP/GTP-binding protein

ESTs, Moderately similar to T42650 hypot

ELAV (embryonic lethal, abnormal vision,

EST

EST

ESTs

hypothetical protein
ESTs

cytochrome P450, subfamily XIX (aromatiz
KIAA0712 gene product
peptidyl-prolyl isomerase G (cyclophilin
ESTs

membrane-associated nucleic acid binding
dual specificity phosphatase 10
ESTs

ESTs, Weakly similar to S65824 reverse t
ESTs, Weakly similar to IB4DJWMAN NADP-
ESTs

KIAA0399 protein
ESTs

hypothetical protein DKFZp434F1819

dynein, axonemal, heavy polypeptide 9

ESTs

ESTs

ESTs

regulator of G-protein signalling 20
ESTs

ESTs, Weakly similar to A43932 mucin 2 p
centrosomal P4.1-associated protein; unc
zinc finger protein 198

gb:zx29c03.s1 Soares_total_fetus_Nb2HF8_

hypothetical protein FLJ23251

ESTs

gb:zx33a08.s1 Soares_total_fetus_Nb2HF8_

hypothetical protein FLJ14007

ESTs

nuclear respiratory factor 1

ESTs

ESTs

ESTs, Weakly similar to B34087 hypotheti
ESTs

3-hydroxyisobutyryl-Coenzyme A hydrolase

hypothetical protein FLJ13409; KIAA1711

hypothetical protein FLJ10160 similar to

ESTs, Weakly similar to T17454 diaphanou

splicing factor (CC1.3)

ESTs, Weakly similar to putative p150 [H

gb:zx78g01.s1 Soares ovary tumor NbHOT H

ESTs

ESTs

SMAD in the antisense orientation
glucose regulated protein, 58kD
hypothetical protein MGC2752
RAB2, member RAS oncogene family
oncogene TC21

Homo sapiens cDNA FLJ20738 fis, clone HE
Homo sapiens mRNA; cDNA DKFZp667N064 (fr
KIAA0870 protein
332467 AA489S30

123233 AA490225 

123234 AA490227 
123236 AA490255 

123255 AA490890

430015 AA490916 

446892 AA490925 

123259 AA490955 

123284 AA495612 

123286 AA495824

123315 AA496369

457397 AA504125 

433049 AA521473 

123421 AA598440 

123449 AA598899

426981 AA599244 

409986 AA599694 

123497 AA600037 

123604 AA609135 

123712 AA609684

123731 AA609839 

123800 AA620423 

123841 AA620747 

123929 AA621364 

123978 C20653

133184 D20085 

132835 D20749 

435147 D51285 

128595 D59972 

124029 F04112

124057 F13604 

449316 H01662 

130973 H05135 

124106 H12245 

124136 H22842

124165 H30894 

429627 H43442 

124178 H45996 

129948 H69281 

452114 H69485
124+0826254 

129056 H70627 

427580 H73260 

426793 H77531 

124274 H80552

129078 H80737 

457658 H93412 

124315 H94892 

437712 H95643 

124324 H96552

452933 H97146 

132231 H99131 

421877 H99462 

443m H99B37 

132963 N22140

420473 N22197 

417381 N23756 

130365 N24134 

456610 N24195 

439311 N26739

124383 N27098 

124387 N27637 

129341 N33090

419793 N35967 

124433 N39059

124441 N46441 

132338 N48270 

436575 N48365 

124466 N51316 

408048 N51499

124483 N53976 

124484 N54157 

124485 N54300 
124494 N54831 

129200 N59849

124527 N62132 



NWL014700HS.119004 
AW974175 Hs.151675 
NNL001938HS.16697 
AW968504 Hs.1 23073 
AA830335 Hs.1 05273 
AW768399 Hs.1 06357 
AF084535 Hs.22464 
AI744152 Hs^83374 
AA488988 Hs.293796 
AA495824 Hs.188822 
AA496359 

AW969025 Hs.1 09154 
AU076668 Hs.334884 
AA598440 Hs.291154 
AL049325 Hs.1 12493 
AL044675 Hs.1 73081 
NIVL014777H$^7730 
AA765256 Hs.1 351 91 
AA609135 Hs-293076 
AA609684 

AA609839 Hs.334437 
AA620423 Hs.1 12862 
AA620747 Hs.1 12896 
AA621364 Hs.112981 
T89832 Hs.170278 
AA001021 Hs.6685 
Z83844 Hs.5790 
AL133731 Hs.4774 
NKL003478HS.101299 
F04112 Hs.312553 
AA902384 Hs.73853 
AI609045 Hs.321775 
AI638418 Hs.1440 
H12245 

H22842 Hs.101770 
H30039 Hs, 107674 
NM_015340Hs.2450 
BE463721 Hs,97101 
AI537162 Hs.263988 
N22687 Hs.8236 
H69899 H69899 
AI769958 Hs.108336 
AK001507 Hs.44143 
X89887 Hs.1 72350 
H80552 Hs.102249 
AI351010 Hs.102267 
AW952124 Hs 13094 
NNL005402HsJ288757 
X04588 Hs.85844 
H96552 Hs.159472 
AW391423 Hs.288555 
AA662910 Hs.42635 
AW250380 Hs.109059 
AA094538 Hs.272808 
AA099693 Hs.34851 
AL1 18782 Hs.300208 
AF164142 Hs.82042 
W56119 Hs.155103 
AF172066 Hs.1 06346 
BE270568 Hs.151945 
N27098 Hs.102463 
N27637 Hs.109019 
AI193519 Hs.226396 
AI364933 Hs.1 58913 
AA280319 Hs.288840 
AW450481 Hs.161333 
AA353868 Hs.1 82982 
A1473114 

R10084 Hs.1 13319 
NM 007203HS.42322 
A1821780 Hs.179864 
H66118 Hs.285520 
AB040933 Hs. 15420 
N54831 Hs^71381 
N59849 Hs.13565 
N79264 Hsl269104 



KIAA0665 gene product

ESTs, Weakly similar to MAPB_HUMAN MICRO

down-regulator of transcription 1, TBP-b

CDC2-related protein kinase 7

ESTs

ESTs

epilepsy, progressive myoclonus type 2,
ESTs, Weakly similar to CA15_HUMAN COLLA
ESTs

ESTs, Weakly similar to A46010 X-linked
gb:zv37d10.s1 Soares ovary tumor NbHOT H
ESTs

SEC10 (S. cerevisiae)-like 1

EST, Weakly similar to I38022 hypothetic

Homo sapiens mRNA; cDNA DKFZp564D036 (fr

KIAA0530 protein

KIAA0133 gene product

ESTs, Weakly similar to unnamed protein

ESTs

Homo sapiens cDNA: FLJ21543 fis, clone C

gbaeS2K)1.5l Stratagene lung carcinoma

EST

ESTs

ESTs

ESTs

thyroid hormone receptor interactor 8

hypothetical protein dJ37E16.5

Homo sapiens mRNA; cDNA DKFZp761C1712 (f

cullin 5

gb:HSC2JH062 normalized infant brain cDN
bone morphogenetic protein 2
hypothetical protein DKFZp434D1428
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
gb:ym17a12.r1 Soares infant brain 1NIB H
EST
ESTs

leucyl-tRNA synthetase, mitochondrial
putative G protein-coupled receptor
ESTs
ESTs

gb:yu70c12.s1 Weizmann Olfactory Epithel
ESTs, Weakly similar to ALUE_HUMAN !!!!
Homo sapiens clone FLB6914 PRO1821 mRNA,
HIR (histone cell cycle regulation defec
EST



presenilins associated rhomboid-like pro
v-ral simian leukemia viral oncogene hom
neurotrophic tyrosine kinase, receptor,
Homo sapiens cDNA: FLJ22224 fis, clone H
Homo sapiens cDNA: FLJ22425 fis, clone H
hypothetical protein DKFZp434K2435
mitochondrial ribosomal protein L12
putative transcription regulation nuclea
epsilon-tubulin

Sec23-interacting protein p125
solute carrier family 23 (nucleobase tra
eukaryotic translation initiation factor
retinoic acid repressible protein
mitochondrial ribosomal protein L43
EST
ESTs

hypothetical protein FLJ11126

serine/threonine kinase 24 (Ste20, yeast

PRO1575 protein

ESTs

golgin-67

ESTs

kinesin heavy chain member 2
A kinase (PRKA) anchor protein 2
ESTs

ESTs, Weakly similar to 2109260A B cell
KIAA1500 protein

ESTs, Weakly similar to I38022 hypotheti
Sam68-like phosphotyrosine protein, T-ST
ESTs
124532 N62375
133213 NS138 
124539 N83172 
12919S N63787 

124575 N68168 

124576 N68201 

124577 N68300 

124578 N68321 
124593 N69575 
128501 N75G07 
332434 N75542 
128473 N90056 
128839 M91248 
124852 N92751 
133137 N93214 
124671 N99148 
133054 R07876 
425266 R10865 
124720 R11056 
124722 R11488 
128944 R23930 
132965 R26589 
426504 R37588 
438828 R37613 
124757 R38398 
124762 R39179 
124773 R40923 
135266 R41179 
427961 R41294 
414303 R42307 
128540 R43189 
124785 R43306 

124792 R44357 

124793 R44519 
124799 R45088 
124812 R47948 
124821 R51524 
424123 R54950 
124835 R55241 
124845 R59585 
124847 R60044 
440630 R60872 
124861 R 66690 
332503 R67266 
124879 R73588 
124892 R79403 
124906 R87647 
124922 R93622 

124940 R99599 

124941 R99612 
124943 T02888 
124947 T03170 
124954 T10465 
456862 T15418 
410653 T15597 
418133 T15652 
440014 T16898 
131082 726644 
124980 T40841 
124984 T47566 
124991 T50116 
457222 T50145 
125000 138615 
132932 T59940 
444484 T63595 

125008 T64891 

125009 T64924 
445384 T64933 

125017 T68875 

125018 T69027 
125020 T69924 
437871 T70353 
134204 T79780 
125050 T79951 
125052 T80174 
125054 T80622 



N62375 Hs.102731 
AA903424 Hs.6786 
D54120 Hs.146409 
BE296313 Hs^65592 
N68168 
N58201 

N68300 Hs.138465 
N68321 Hs.231500 
N69575 Hs.102788 
AL133572 Hs.1 99009 
A1680737 Hs.289068 
T78277 Hs.100293 
AW582962 Hs.102897 
W19407 HsJ»62 
AB002316 Hs.65746 
AK001357 Hs.102951 
AA464836 Hs.291079 
J00077 Hs.155421 
R05283 

T97733 Hs.185685 
AL137586 Hs^2763 
AI248173 Hs.191460 
AW162919 Hs.170160 
AL134275 Hs.6434 
H11368 Hs.141055 
AA553722 Hs.92096 
R45154 Hs.338439 
R41179 Hs.97393 
AW293165 Hs.143134 
NWL004427HS.165263 
AW297929 Hs.328317 
W38537 Hs.280740 
R44357 Hs.48712 
R44519 
R45088 

R47948 Hs.188732 
H87832 Hs.7388 
AW966158 Hs.58582 
R55241 Hs.101214 
R59585 Hs.101255. 
W07701 Hs.304177 
BE561430 Hs.239388 
R67567 Hs.107110 
NM_004455Hs.150956 
R73588 Hs.101533 
A1970003 Hs.23756 
H75964 Hs.107815 
R93622 hte.12163 
AF068846 Hs.1 03804 
AI766561 Hs.27774 
AW963279 Hs.123373 
T03170 Hs.100165 
AW964237 Hs.6728 
U55184 Hs.1 54145 
BE383768 Hs.65238 . 
R43504 Hs.6181 
AW960782 Hs.6856 
AW91121 Hs.246218 
T40841 Hs.98681 
BE313210 Hs.334798 
T50116 

NM_004477Hs.203772 
T58615 Hs.235887 
AW1 18826 Hs.6093 
AK0Q2126 Hs.11260 
T91251 

T64924 Hs.303046 
T79136 Hs.127243 
T68875 

T69027 Hs^69481 



AIQ84813 Hs.1 14088 
AI873257 Hs.7994 
AW970209 Hs.1 11805 
T85104 Hs.222779 
T80622 Hs.268601 



EST 
ESTs 

cell division cycle 42 (GTP-binding prot
ESTs, Weakly similar to I38022 hypotheti
gb:za11c01.s1 Soares fetal liver spleen
ESTs, Weakly similar to I38022 hypotheti
gb:za12g07.s1 Soares fetal liver spleen
EST
ESTs

protein containing CXXC domain 2
Homo sapiens cDNA FLJ11918 fis, clone HE
O-linked N-acetylglucosamine (GlcNAc) tr
CG147 protein

regulator of nonsense transcripts 2; DKF
KIAA0318 protein

Homo sapiens cDNA FLJ10495 fis, clone NT
ESTs, Weakly similar to T27173 hypotheti
alpha-fetoprotein

gb:ye91c08^1 Soares fetal liver spleen
ESTs

anaphase-promoting complex subunit 7
hypothetical protein MGC12936
RAB2, member RAS oncogene family-like
hypothetical protein DKFZp761F2014
Homo sapiens clone 23758 mRNA sequence
ESTs, Moderately similar to A46010 X-lin
ESTs

KIAA0328 protein
ESTs

early development regulator 2 (homolog o
EST

hypothetical protein MGC3040
hypothetical protein FLJ20736
gb:yg24h04.s1 Soares infant brain 1NIB H
gb:yg38g04.s1 Soares infant brain 1NIB H
ESTs

kelch (Drosophila)-like 3

Homo sapiens cDNA FLJ12789 fis, clone NT

EST

ESTs

Homo sapiens clone FLB8503 PRO2286 mRNA,
Human DNA sequence from clone RP1-304B14
ESTs

exostoses (multiple)-like 1
ESTs

hypothetical protein similar to swine ac
ESTs

eukaryotic translation initiation factor
heterogeneous nuclear ribonucleoprotein
ESTs, Highly similar to AF161349 1 HSPC0
ESTs, Weakly similar to ALU1_HUMAN ALU S
ESTs

KIAA1548 protein

hypothetical protein FLJ11585

95 kDa retinoblastoma protein binding pr

ESTs

ash2 (absent, small, or homeotic, Drosop
Homo sapiens cDNA: FLJ21781 fis, clone H
ESTs

eukaryotic translation elongation factor
gb:yb77c10.s1 Stratagene ovary (937217)
FSHD region gene 1
ESTs

Homo sapiens cDNA: FLJ22783 fis, clone K
hypothetical protein FLJ11264
gbydSOa10^1 Soares fetal liver spleen
ESTs

Homo sapiens mRNA for KIAA1724 protein,
gb:yc30i05.s1 Stratagene liver (937224)
sex comb on midleg homolog 1
gb:yc19d03.r1 Stratagene lung (937210) H
ESTs

hypothetical protein FLJ20551
ESTs

ESTs, Moderately similar to similar to N
ESTs, Weakly similar to envelope [H.sapi
125053 T85352


T85352 




125064 T85373 


T85373 




125066 T86284 


T86284 




416507 T89579 


AL045364 


Hs.79353 


125080 T90360 


T90360 


Hs.258620 


125097 T94328 


AW576389 Hs.335774 


125104 T9S590 


T955S0 




135107 T97257 


T97257 


Hs.94550 


423122 T97599 


AA845462 Hs.124024 


125118 TQ7R20 


R10606 


HS253890 




T97775 


Hs.100717 


1 341ft) TQ81*>2 


T98152 


Hs.79432 


125138 \Att147Q 


AW962364 Hs.129051 


125144 W37Q9Q 

it*/ inn no/ 333 


AB037742 Hs.24336 


12>150 W38240 


W38240 




450142 W40150 


A W207469 Hs2m5 


131987 W45435 


AW453069 


Hs3657 


125178 W582Q2 


W93127 


HsJ1845 


125180 W58344 


W58469 


Hs.103120 


125182 W58650 


AA451755 


Hs.263560 


446888 W68736 


AL030995 


Hs.16411 


125197 W691GS 


AF086270 


Hs.276554 


133497 W69111 


BE617303 


Hs.74266 


*r£S3i3CC V V Oil 039 


Z97630 


Hs.226117 


129232 WfiQAW 


R98881 


Hs.109655 


4221 66 W79424 


VV72424 


Hs.1 12405 


125209 W7972A 


W72724 


Hs.103174 


195912 W79A34 


AA746225 


Hs.103173 


456831 WT3955 


BE383436 


Hs.108847 


125223 W74701 


AIS16269 


Hs.109057 


125225 W76540 


W74169 


Hs.16492 


125228 W79397 


AA033982 


Hs.110059 


132393 W85B88 


AL135094 


Hs.47334 


125238 W86038 


N99713 


Hs.109514 


125247 W86881 


AA694191 


Hs.163914 




AI051967 


Hs.110122 


125263 W88Q49 


AA098678 






W90022 


Hs.166809 


450862 W92272 


U91543 


Ks.25601 


/co/ni WQ97R4 


NWL007115HS.29352 


428243 WQ3Qdn 


H05317 


Hs.283549 


125277 WQ3227 


W93227 


Hs.103245 


125278 W93523 


AI218439 


Hs.129998 


125280 W9365Q 


AI123705 


Hs.106932 


448205 W94003 


W93949 


Hs.33245 


131844 W94401 


AI419294 


Hs.324342 


125284 W94688 

i fawn v v jiuuu 


NIVL002566H3.103253 


417111 WQ4787 

*t II III VVJ/HIO/ 


AW016321 Hs.82306 


445424 Z38294 


AB028945 Hs.12696 


125289 73R311 


T34530 


Hs.4210 


A46313 7384M 

"riW IO t-OCrrUj 


K06245 


Hs.106801 




AW971018 Hs.21659 


433997 rvwu 


AB040923 Hs.106808 




AB037715 


Hs.183639 


424624 7387R3 


AB032947 


Hs.151301 




AB022317 


Hs.25887 


125298 7309*55 


AW972542 Hs.289008 


125300 Z39591 


Z39591 


Hs.101376 


448378 Z39783 


BE622770 Hs.264915 


444582 Z39920 


R55344 


Hs.22142 


130862 Z40166 


AA497044 


Hs.20867 


128888 Z40388 


AI760853 


Hs.241558 


125310 Z40646 


R59161 


Hs.124953 


125315 Z41697 


R38110 


Hs.106296 


125317 Z99349 


Z99348 


Hs.112461 


135096 Z99394 


AA081258 





gb:yd82d01.s1 Soares fetal liver spleen
gb:ydB207.s1 Soares fetal liver spleen
gb:ycJ77b07.s1 Soares fetal liver spleen
transcription factor Dp-1
ESTs, Highly similar to ALU6_HUMAN ALU S
EST, Moderately similar to SS5557 alpha-
gb:ye40a03.s1 Soares fetal liver spleen
ESTs, Moderately similar to I38022 hypot
deltex (Drosophila) homolog 1
gb:yf35f11.s1 Soares fetal liver spleen
EST

fibrillin 2 (congenital contractural ara
ESTs

KIAA1321 protein

Empirically selected from AFFX single pr

chondroitin sulfate proteoglycan 6 (bama

activity-dependent neuroprotective prote

ESTs

ESTs

ESTs

hypothetical protein LOC57187

heterochromatin-like protein 1

hypothetical protein MGC4251

H1 histone family, member 0

sex comb on midleg (Drosophila)-like 1

S100 calcium-binding protein A9 (calgran

ESTs, Weakly similar to TSP3LHUMAN THROW

ESTs

hypothetical protein MGC2749

ESTs, Weakly similar to ALU5_HUMAN ALU S

DKFZP564G2022 protein

ESTs, Weakly similar to I38022 hypotheti

hypothetical protein FLJ14495

ESTs

ESTs

ESTs

gb:zn45g10.ii Stratagene HeLa cell s3 93
ESTs, Highly similar to LCT2L.HUMAN LEUKO
chromodomain helicase DNA binding protei
tumor necrosis factor, alpha-induced pro
ESTs
EST

enhancer of polycomb 1

ESTs

ESTs

ESTs

perilipin

destrin (actin depolymerizing factor)
cortactin SH3 domain-binding protein
Homo sapiens cDNA FLJ13069 fis, clone NT
ESTs, Weakly similar to PC4259 ferritin
ESTs

kelch (Drosophila)-like 1
hypothetical protein FLJ10210
Ca2+-dependent activator protein for secr
sema domain, immunoglobulin domain (Ig),
Homo sapiens cDNA: FLJ21814 fis, clone H
EST

Homo sapiens cDNA FLJ12908 fis, clone NT

cytochrome b5 reductase b5R.2

hypothetical protein FLJ10392

ariadne (Drosophila) homolog 2

ESTs

ESTs

ESTs, Weakly similar to I38022 hypotheti
zinc finger protein 36 (KOX 18)
Table 3A shows the accession numbers for those pkeys lacking unigeneID's for Table 3. The pkeys in Table 7 lacking unigeneID's are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number
Accession: Genbank accession numbers
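
The three-column layout described above (Pkey, CAT number, Accession) can be read mechanically. The following is a minimal sketch only, assuming each row is whitespace-delimited with the pkey first, the CAT number second, and one or more accession numbers after that. The names `ClusterRow`, `parse_row`, and `index_by_accession` are hypothetical and are not part of the original disclosure; the second sample row assumes the stray space in its CAT number is a scanning artifact.

```python
from typing import Dict, List, NamedTuple, Tuple


class ClusterRow(NamedTuple):
    """One row of a Pkey / CAT number / Accession table (hypothetical type)."""
    pkey: int
    cat_number: str
    accessions: Tuple[str, ...]


def parse_row(line: str) -> ClusterRow:
    """Split a whitespace-delimited row into its three columns."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected pkey, CAT number and accession(s): {line!r}")
    return ClusterRow(int(fields[0]), fields[1], tuple(fields[2:]))


def index_by_accession(rows: List[ClusterRow]) -> Dict[str, List[int]]:
    """Map each accession number to the pkeys whose cluster contains it."""
    index: Dict[str, List[int]] = {}
    for row in rows:
        for accession in row.accessions:
            index.setdefault(accession, []).append(row.pkey)
    return index


# Two rows taken from the table below.
rows = [
    parse_row("124720 144582.1 R05283 R11056"),
    parse_row("124106 125446.1 H12245 AA094769 R14576"),
]
lookup = index_by_accession(rows)
print(lookup["R11056"])
```

An accession-keyed index is used here because the table is typically consulted in that direction (from a Genbank accession back to a probeset); a pkey-keyed dictionary would serve equally well for the reverse lookup.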



Pkey CAT Number Accession



108469 116761J AA079487 AA128547 AA128291 AA079587 AAQ796G0 
124106 125446.1 H12245 AA094769 R14576
108501 13884.-12 AA083256 

108562 36375J AA100796 AF020589 AA074629 AA075946 AA100849 AA085347 AA126309 AA079311 AA079323 AA085274

101300 4669.1 BE535511 M62093 AA306787 AW891766 AA348998 AA338869 AA344013 AW956561 AW389343 AW403607 L40391 

AW408435 AA121738 AI568978 H13317 R20373 AW948724 AW948744 AA335023 AA436722 AA448690 C21404 
AW884390 AA345454 AA303292 AA174174 BE092290 T90614 AA035104 R76028 AA126924 AA741086 AW022056 
AW118940 AA121666 AI832409 AA683475 AI140901 AI623576 AW519064 AW474125 AI953923 AI735349 AW150109
AI436154 AW118130 AW270782 AI804073 N27434 AA876543 AA937815 AI051166 AA505378 AI041975 AI335355
AI089540 AA662243 AI127912 AI925604 AI250880 A136S874 AI564386 AI815196 AI583526 AI435885 AI160934 H79030 
AI801493 AA448691 AI673767 AC076042 AI804327 AA813438 AA680002 AI274492 T16177 AI287337 AI935050 
AA907805 AA911493 AI589411 AI371358 AW576236 AI078866 AW516168 AA346372 AI560185 AA471009 R75857
AA296025 AA523155 AA853168 AI696593 AI658482 AI566601 AW072797 AA128047 AA035502 AW243274 AA992517
R43760 
120409 genbank_AA235050 AA235050

120745 genbank_AA302809 AA302809

120809 genbank_AA346495 AA346495

120839 genbank_AA348913 AA348913

113702 genbank_T97307 T97307

106596 304084.1 AI583948 AA578212 AW303715 AA653450 AA456981 AI400385 W88533 AI224133 AW272145 AA088686 R94698 

113947 genbank_W84768 W84768

122562 genbank_AA452156 AA452156

122635 genbank_AA454085 AA454085

108277 genbank_AA064859 AA064859

108403 genbank_AA075374 AA075374

122860 genbank_AA464414 AA464414

108427 genbank_AA076382 AA076382

108439 genbank_AA078986 AA078986

131353 231290J AW41 1259 H23555 AW015049 AI684275 AW015886 AW068953 AW014085 AI027260 R52686 AA918278 A1129462 

AA959360 N34869 A1948416 AA534205 AA702483 AA705292 

108533 genbank_AA084415 AA084415

124254 genbank_H69899 H69899

101447 entrez_M21305 M21305

101458 entrez_M22092 M22092

101657 13349.1 NM.005381 M60858 AW373732 AW373724 AW373689 AW373629 AW373609 AW373776 AA1B7806 AW386946 



AW374207 T05235 AA216203 AW385556 AA306940 AA306526 AA315461 AL036757 AW37371 1 AW403124 AW403640 
AW377084 727360 H62638 F06957 AW377051 AA554779 AA378568 AA096007 AW352407 AW302637 F07929 H17433 
AW382712 H05665 F07292 N39875 AA089729 H62556 N42842 R12952 AW373735 AW364155 AA056183 W39185 
AW382708 N32488 AF1 14096 AW375993 AI133569 W52561 AA503040 AA133710 AI92B796 AW176370 AA827519 
AW338437 AA521142T29341 AI800461 AW317002 AA703914 AA860830 AI859203 AI445772 AA714334 AI817066 
AI832027 AW510442 AI635802 AW088306 AW068672 AW408555 AW467542 AA552657 AA152367 W32081 AA582124 
AA074040 AA931657 AJ051154 AW41Q203 AI921644 H17434 AI832330 AW404836 AI925038 AA08B423 AA954166 
AA580453 AW021 292 AG67215 AW080082 AW383778 AI933053 AI919097 W31557 N90245 AA931591 AA563995 
F35352 AA056184 AA476294 AA641327 AA533550 AI749630 W5B323 AA569119 AA508573 AI809050 A1378996 
AA411362 AW407505 AA938104 AA074041 AA632876 AW1 93748 AA507873 A1270128 AI472365 AA411363 AI523216 
AI719965 AJ816302 AA182661 AI707990 AA1 33588 AI758537 W60253 A1460308 AA135423 AI083904 F04186 N89693 
AW408776 A1678595 AI270568 AA722059 W58234 F33650 AA090547 AA285108 AA425981 N85079 D20218 A1273980 
AA159028 F03226 AW247914 N26918 AW272741 N90109 H05666 N23327 AW247953 R44748 AA962015 F03558 
AI752394 AW409913 AW248396 AI816463 AI752393 AA325370 AA263089 AI570130 AI971951 AI160658 AI357360 
AW168686 AL121075 AW050536 N21672 W67748 AA514242 A1127386 H14607 AI185752 W79364 AA088520 AA152476 
AW351940 AW373683 AB40524 AW374953 T56500 N24329 A1940720 AW374933 AW374947 AW391913 AL1 38337 
AW376241 AW062943 F26666 AW410202 AW062958 F34529 AW381807 AW393315 W17147 AW176359 AA664576 
AW380424 AA306040 AI745674 AW300951 AI188579 AI438973 A1305271 AA433818 AA612807 AIB31809 AI940409 
AA158663AI572988 



124720 144582.1 R05283 R11056 

124793 genbank_R44519 R44519

124799 genbank_R45088 R45088

103138 entrez_X65965 X65955

117683 genbank_N40180 N40180

124991 genbank_T50116 T50116

103432 entrez_X97748 X97748

119174 genbank_R71234 R71234

119239 95573J T11483 T11472

133678 11235J AW247252 AA346143 NM_000270 AA3B1085 N91995 X00737 AA381079 AA296473 AA296110 AA315735 AA311617



AA326750 AA376804 AW403290 T95231 M13953 T47963 H82039 AA279899 AA627997 N76320 N99527 H37842 
W20095 AA457308 AW469547 AA724143 H83220 AA319496 W86334 W30892 R89169 R99427 N41854 H47286 
AA348094 AA045089 R63016 AI922219 AI024906 AJ096488 AI885005 AA194872 N90489 AI452544 H72411 AA282427 
AA430735 R68963 R22453 H70385 AW129369 AW467320 AW519082 AA345018 AA582183 AI9S1769 R65918 N30611 
AI979189 AC80889 AW273191 R66531 AC85845 AT675927 AI421990 AW190879 H37794 AA699667 H68427 AA954388 
AI188757 AI140048 AA430382 A1204151 AW247864 AA559099 AM31420 AA548276 AI149466 AA772569 AA694388 
AA724168 AA301651 AA281952 AA779925 AA234760 W86290 AA913603 AW511745 A1500697 AA814922 AA835040 



124576 genbank_N68201 N68201
108931 genbank_AA147186 AA147186
108941 genbank_AA146650 AA148650
PCT/US02/04915



T47964 H53998 AA975804 R98710 AI077504 N70252 R98084 AW250171 H59268 A1597614 AA970746 AA972548 
A13771 16 R62982 H16737 R8907O M731329 R65532 N54354 A1318832 H81944 N71567 T95122 W86463 AA437095 
AI431999 A015724 N63851 AI874743 AA457307 AA21 1475 N64444 AT799146 H72853 R39335 H60413 AA770367 
AA156105 AI269937 H64029 H89728 R65819 AW470496 AI873318 AI735713 H82987 C02447 AI478666 T27651 
AI699770 AW025156 H69719 AI984717 N69225 AI459856 AA953577 AW24691 H13843 R22404 AB73798 AJ336002 
N70898 AI420854 AA541792 AA346142 A1000814 AI828348 AA045090 T51257 N90434 H 13850 N73184 AI708083 
AA78160S AA329050 AA339985 R68964 H64795 W04186 H16845 



119415 genbank_T97186 T97186

119558 NOT_FOUND_entrez_W38194 W38194

119559 NOT_FOUND_entrez_W38197 W38197

119654 genbank_W57759 W57759

121350 genbank_AA405237 AA405237

121558 genbank_AA412497 AA412497

105985 genbank_AA406610 AA406610

114648 genbank_AA101056 AA101056

121895 genbank_AA427396 AA427396

100327 entrez_D55640 D55640

123315 714071J AA496369 AA496646

123473 genbank_AA599143 AA599143
TABLE 4:



Pkey: Unique Eos probeset identifier number

Accession: Accession number used for previous patent filings

ExAccn: Exemplar Accession number, Genbank accession number

UnigeneID: Unigene number

Unigene Title: Unigene gene title



Pkey Accession ExAccn UnigeneID Unigene Title



100405 
100420 
100461 
100484 
10071B 
100991 
101097 
101168 
101194 
101261 
101345 
101447 
101485 
101543 
101550 
101560 
101674 
101714 
101741 
101838 
101857 
102012 
102024 
102164 
102241 
102283 
102303 
102564 
102663 
102759 
102778 
102804 
102887 
102898 
102915 
103036 
103037 
103095 
103158 
103166 
103185 
103280 
103554 
103850 
104465 
104592 
104764 
104786 
104850 
104865 
104894 
104952 
104974 
105178 
105263 
105330 
105376 
105729 
105826 
105977 
108008 
106031 
106124 



D85425 
D86983 

HG1098-HT1098 

HG11(»tfT1103 

HG3342-HT3519 

J03764 

106797 

L15388 

L20971 

L35545 

L76380 

M21305 

M24736 

M31166 

M31551 

M32334 

M61916 

M5B874 

M74719 

M92934 

M94856 

U03057 

U03877 

U18300 

U27109 

U31384 

U33053 

U59423 

U70322 

U81607 

U83463 

U89942 

X04729 

X06256 

X07820 

X54925 

X54936 

X60957 

X67235 

X67951 

X69910 

X79981 

Z18951 

AA187101 

N24990 

R81003 

AA025351 

AA027168 

AA040465 

AA045136 

AA054087 

AA071089 

AA085918 

AA1B7490 

AA227926 

AA234743 

AA236559 

AA292694 

AA398243 

AA406363 

AA411465 

AA412284 

AA423987 



AW291587 Hs.82733
D859S3 Hs.118893
X70377 Hs.121489
NM_005402 Hs.288757
BE295928 Hs.75424
J03836 Hs.82085
BE245301 Hs.89414
NM_005308 Hsi11569
L20971 Hs.188
D30857 Hs.82353
NM_005795 Hs.152175
M21305

AA298520 Hs.89546
M31166 Hs.2050
Y00630 Hs.75716
AW958272 Hs.347326
NM_002291 Hs.82124
M68874 Hs.211587
NM_003199 Hs.326198
BE243845 Hs.75511
BE550723 Hs.153179
BE259035 Hs.118400
AA301867 Hs.76224
NM_000107 Hs.77602
NM_007351 Hs.268107
AW161552 Hs.83381
U33053 Hs.2499
U59423 Hs.79057
NM_002270 Hs.168075
NM_005100 Hs.788
AF000652 Hs.8180
NM_002316 Hs.83354
J03836 Hs.82085
NM_002205 Hs.149609
X07820 Hs.2258
M13509 Hs.83169
BE018302 Hs.2894
NM_005424 Hs.78824
BE242587 Hs.118651
AA159248 Hs.180909
NM_006825 Hs.74368
U84722 Hs.76206
AI878826 Hs.74034
AA187101 Hs.213194
Z44203 Hs.26418
AW630488 Hs.25338
AJ039243 Hs.278585
AA027167 Hs.10031
AL133035 Hs£728
T79340 Hs^2575
AF065214 Hs.18858
AW076098 Hs.345588
Y12059 Hs.278675
AA313825 Hs^1941
AW388633 Hs.6682
AW338625 Hs^2120
AW994032 Hs.3768
H46612 Hs.293815
AA478756 Hs.194477
AK001972 Hs.30822
AB033888 Hs.8619
X64116 Hs.171844
H93366 Hs.7567



nidogen 2

Melanoma associated gene
cystatin D

v-ral simian leukemia viral oncogene hom
inhibitor of DNA binding 1, dominant neg
serine (or cysteine) proteinase inhibito
chemokine (C-X-C motif), receptor 4 (fus
G protein-coupled receptor kinase 5
phosphodiesterase 4B, cAMP-specific (dun
protein C receptor, endothelial (EPCR)
calcitonin receptor-like
gb:Human alpha satellite and satellite 3
selectin E (endothelial adhesion molecul
pentaxin-related gene, rapidly induced b
serine (or cysteine) proteinase inhibito
intercellular adhesion molecule 2
laminin, beta 1

phospholipase A2, group IVA (cytosolic,
transcription factor 4
connective tissue growth factor
fatty acid binding protein 5 (psoriasis-
singed (Drosophila)-like (sea urchin fas
EGF-containing fibulin-like extracellula



multimerin

guanine nucleotide binding protein 11

protein kinase C-like 1

MAD (mothers against decapentaplegic, Dr

karyopherin (importin) beta 2

A kinase (PRKA) anchor protein (gravin)

syndecan binding protein (syntenin)

lysyl oxidase-like 2

serine (or cysteine) proteinase inhibito
integrin, alpha 5 (fibronectin receptor,
matrix metalloproteinase 10 (stromelysin
matrix metalloproteinase 1 (interstitial
placental growth factor, vascular endoth
tyrosine kinase with immunoglobulin and
hematopoietically expressed homeobox
peroxiredoxin 1

transmembrane protein (63kD), endoplasmi
cadherin 5, type 2, VE-cadherin (vascula
caveolin 1, caveolae protein, 22kD
hypothetical protein MGC10895
ESTs

protease, serine, 23
ESTs

KIAA0955 protein

hypothetical protein DKFZp434G171
B-cell CLL/lymphoma 6, member B (zinc fi
phospholipase A2, group IVC (cytosolic,
desmoplakin (DPI, DPII)
bromodomain-containing 4
AD036 protein

solute carrier family 7, (cationic amino
ESTs

hypothetical protein FLJ10849

Homo sapiens HSPC285 mRNA, partial cds

E3 ubiquitin ligase SMURF2

hypothetical protein FLJ11110

SRY (sex determining region Y)-box 18

Homo sapiens cDNA: FLJ22296 fis, clone H

Homo sapiens cDNA: FLJ21962 fis, clone H



105155 
106302 
108423 
105793 
107174 
107216 
107295 
107385 
103756 
108846 



109001 
109166 
109456 
109768 
110107 
110906 
1109B4 
111006 
111018 
111133 
111760 
113073 
113195 
113923 
114521 
115061 
115096 
115145 
115819 
115947 
116314 
116339 
116430 
116589 
116733 
117023 
117186 
117563 
117997 
118475 
118581 
119073 
119155 
119174 
119221 
119416 
119866 
121335 
121381 
123160 
123473 
123523 
123533 
123964 
124006 
124315 
124659 



AA425309 

AA435898 

AA448238 

AA478778 

AA621714 

D51069

T34527 

U97519 

AA127221 

AA132983 

AA135606 

AA156125 

AA179845 

AA232645 

F10399 

H16772 

N39584 

N52006 

N53375 

N54067 

N64436 

R26892 

733637 

T57112 

W80763 

AA046808 

AA253217 

AA255991 

AA258138 

AA426573 

AA443793 

AA490588 

AA496257 

AA609717 

D59570 

F13787 

H88157 



124847 
124875 
125091 
125103
125355 
125565 
125590 
423765 
126511 
100286 
126563 
126649 
449602 
126872 
456000 
414221 
127402 



N34287 

N52090 

N66845 

N 58905 

R32894 

R61715 

R71234 

R98105 

T97186 

W80814 

AA404418 

AA405747 

AA488687 

AA599143 

AA608588 

AA608751 

C13981 

D60302 

H94892 

N93521 

N95477 

R60044 

R70506 

T91518 

T95333 

R45630 

R20839

R23858 

R23858 

AI024874 

W26247 

W26247 

AA856990 

AA856990 

AA136553 

AA136653 

AA136653 

AA358869 



AA425414 Hs.33287
AA398859 Hs.18397
AB020722 Hs.16714
H94997 Hs.16450
BE122762 Hs.25338
D51069 Hs.211579
AA186629 Hs.80120
NM_005397 Hs.16426
AA127221 Hs.117037
AL117452 Hs.44155 
AA135606 Hs.189384 
AI056548 Hs.72116 
AA219691 Hs.73625 
AW956580 Hs.42699 
F06838 Hs.14763 
AW151660 Hs.31444 
AA035211 Hs.17404 
AW613287 Hs.80120 
BE387014 Hs.166146 
AI287912 Hs.3628 
AW580939 Hs.97199 
BE551929 Hs.268754 
N39342 Hs.103042 
H83265 Hs.8881 
AW953484 Hs.3849 
AW139036 Hs.108957 
AI751438 Hs.41271 
AJ683069 Hs.175319 
AA740907 Hs.88297 
AA486620 Hs.41135 
R47479 Hs.94761 
AI799104 Hs.178705 
AK000290 Hs.44033 
AK001531 Hs.66048 
AI557212 Hs.17132
AL157424 Hs.61289
AW070211 Hs.102415
H98988 Hs.42612 
AF055634 Hs.44553 
N52090 Hs.47420 
N66845 
N68905 

BE245360 Hs.279477 
R61715 Hs.310598 
R71234 

C14322 Hs.250700 
T97186 

AA496205 Hs.193700
AA404418 

AW088642 Hs.97984 

AA488687 Hs.284235 

AA599143 

AA608588 

AA608751

C13981

AI147155 Hs.270016
NM_005402 Hs.288757
AI680737 Hs.289068
AI571594 Hs.102943
W07701 Hs.304177
AI887664 Hs.285814
T91518

AA570056 Hs.122730
R60547 Hs.170098 
R20840 

R23858 Hs.143375 
R23858 Hs.143375 
T92143 Hs.57958 
BE247550 Hs.86859 
AA516391 Hs.181358
AA001860 Hs.279531 
AA001860 Hs.279531 
AW450979 
BE180876 Hs.11614 
AW450979 

AA358869 Hs.227949 



nuclear factor I/B

hypothetical protein FLJ23221

Rho guanine exchange factor (GEF) 15 

ESTs 

ESTs 

melanoma cell adhesion molecule 

UDP-N-acetyl-alpha-D-galactosamine:polyp

podocalyxin-like

ESTs 

DKFZP586G1517 protein 

gb2l10a05^1 Soares _pregnantuterus_NbH 

hypothetical protein FLJ20992 similar to

RAB6 interacting, kinesin-like (rabkines

ESTs 

ESTs 

ESTs 

ESTs 

UDP-N-acetyl-alpha-D-galactosamine:polyp
Homer, neuronal immediate early gene, 3



complement component C1q receptor 
Homo sapiens cDNA FLJ11949 fis, clone HE
microtubule-associated protein 1B
ESTs, Weakly similar to S41044 chromosom
hypothetical protein FLJ22041 similar to
40S ribosomal protein S27 isoform
Homo sapiens mRNA full length insert cDN
ESTs 
ESTs 

endomucin-2
KIAA1691 protein

Homo sapiens cDNA FLJ11333 fis, clone PL

dipeptidyl peptidase 8

hypothetical protein FLJ10669

ESTs, Moderately similar to I54374 gene

synaptojanin 2

Homo sapiens mRNA; cDNA DKFZp586N0121 (f
ESTs, Weakly similar to ALU1_HUMAN ALU S
unc5 (C.elegans homolog) c 
EST 

gbaa46c1 U1 Soares fetal fiver spleen 
gb:za69b09.s1 Soares_fetal_lung_NbHL19W
ESTs

ESTs, Moderately similar to ALU1_HUMAN A
gb:yi54c08.s1 Soares placenta Nb2HP Homo
tryptase beta 1

gb:ye50h09.s1 Soares fetal liver spleen
Homo sapiens mRNA; cDNA DKFZp586I0324 (f
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_
hypothetical protein FLJ22252 similar to
ESTs, Weakly similar to I38022 hypotheti
gb:ae52d04.s1 Stratagene lung carcinoma
gb:ae54e06.s1 Stratagene lung carcinoma
gb:ae56h07.s1 Stratagene lung carcinoma
gb:C13981 Clontech human aorta polyA+ mR 
ESTs 

v-ral simian leukemia viral oncogene hom

Homo sapiens cDNA FLJ11918 fis, clone HE

hypothetical protein MGC12916

Homo sapiens clone FLB8503 PRO2286 mRNA,

sprouty (Drosophila) homolog 4

gb:ye20f05.s1 Stratagene lung (937210) H

ESTs, Moderately similar to KIAA1215 pro

KIAA0372 gene product

gb7g05c08 Jl Soares infant brain 1 NIB H

Homo sapiens, clone IMAGE:3840937, mRNA,

Homo sapiens, clone IMAGE:3840937, mRNA,

EGF-TM7-latrophilin-related protein

growth factor receptor-bound protein 7 

U5 snRNP-specific protein (220 kD), orth 

ESTs 

ESTs 

gb:UI-H-BI3-ala-a-12-0-UI.s1 NCI_CGAP_Su
HSPC065 protein

gb:UI-H-BI3-ala-a-12-0-UI.s1 NCI_CGAP_Su
SEC13 (S. cerevisiae)-like 1
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127651 
424806 
128062 
128992 
129046 
129188 
129314 
129371 
129468 
129765 
129805 
129884 
130495 
130639 
130657 
130828 
130972 
131080 
131137 
131182 
131486 
131573 
131647 
131756 
131859 
131881 
132050 
132083 
132164 
132358 
132413 
132456 
132490 
132676 
132687 
132718 
132736 
132760 
132933 
132968 
132994 
133061 
133147 
133161 
133200 
133260 
133363 
133491 
133517 
133550 
133607 
133614 
133627 
133691 
133696 
133913 
133975 
133985 
134039 
134088 
134161 
134299 
134416 
116470 
134656 
134989 
135051 
135073 
135349 
100114
100130 
100143 
100168 
100208 
100224 
100405



AI123976 

AI123976 

AA379500 

R49693 

AA195678 

M30257 

AA028131 

M10321 

J03040 

M86933

AA012933 

AA286710 

AA243278 

D59711 

T94452 

AA053400 

AA370302 

J05008 

U85193 

AA256153 

X83107 

AA046593 

AA410480 

D45304 

M90657 

AA010163 

AA136353 

Y07867 

U84573 

X60486 

AA132969 

AA114250 

F13782 

AA283035 

AB002301 

AA056731 

U68019 

H99198 

AA598702 

N77151 

AA505133 

AB000584 

D12763 

AA253193 

AA432248 

AA083572 

AA479713 

L40395 

X52947 

W80846 

M34539 

D67029 

U09587 

M85289 

D10522 

W84712 



L34657 

S78569 

D43636 

U97188

AA487558 

M28882 

X70683 

X14787 

AA236324 

C15324 

AA452000 

D83174 

D00596 

D11428 

D13640 

D14874 

D26129

D28476 

D86425 



AA382523 Hs.105689 
AA382523 Hs.105689 
AA379621 Hs.105547 
H04150 Hs.107708 
AB029290 Hs.103258
NM_001078 Hs.109225
BE622768 Hs.290356
X06828 Hs.110802
AW410538 Hs.111779
M86933 Hs.1238
AA012848 Hs.12570
AF055581 Hs.13131
AW250380 Hs.109059
AI557212 Hs.17132
AW337575 Hs.201591
AW631469 Hs.203213
D81866 Hs.21739
NM_001955 Hs.2271
W27392 Hs.33287
AI824144 Hs.23912
F06972 Hs.27372
AA040311 Hs.28959 
AA359515 Hs.30089 
AA443966 Hs.31595 
AW960554 
AW361018 Hs.3383 
AI267615 Hs.38022 
BE386490 Hs.279653
AI752235 Hs.41270
NM_003542 Hs.46423
AW361383 Hs.260116
AB011084 Hs.48924
NM_001290 Hs.4980
N92589 Hs.261038 
AB002301 Hs.54985 
NM_004600 Hs.554
AW081883 Hs.211578 
AA125985 Hs.56145 
BE263252 Hs.6101 
AF234532 Hs.61638 
AA112748 Hs.279905
AI186431 Hs.296638 
AA026533 Hs.66 
AW021103 Hs.6631 
AB037715 Hs.183639 
AA403045 Hs.6906 
AI866286 Hs.71962 
BE619053 Hs.170001
NM_000165 Hs.74471
AI129903 Hs.74569
BE273749

NM_003003 Hs.75232
NM_002047 Hs.75280
M85289 Hs.211573
AI878921 Hs.75607
AU076964 Hs.7753 
C18356 Hs.295944 
L34657 Hs.78146 
NM_002290 Hs.78672
AI379954 Hs.79025 
AA634543 Hs.79440 
AW580939 Hs.97199 
X68264 Hs.211579 
AI272141 Hs.83484
AI750878 Hs.87409 
AW968058 Hs.92381 
AI272141 Hs.83484 
W55956 Hs.94030 
AA114212 Hs.9930
X02308 Hs.82962
NM_000304 Hs.103724
AU076465 Hs.278441
H73444 Hs.394
NM_002933 Hs.78224
AL121516 Hs.138617 
AW291587 Hs.82733 



MSTP031 protein 
MSTP031 protein 

neural proliferation, differentiation an
ESTs

actin binding protein; macrophin (microf
vascular cell adhesion molecule 1
mesoderm development candidate 1
von Willebrand factor
secreted protein, acidic, cysteine-rich
amelogenin (Y chromosome)
tubulin-specific chaperone d
lysosomal

mitochondrial ribosomal protein L12
ESTs, Moderately similar to I54374 gene
ESTs
ESTs

Homo sapiens mRNA; cDNA DKFZp586I1518 (f
endothelin 1
nuclear factor I/B
ESTs

BMX non-receptor tyrosine kinase

ESTs 

ESTs 

ESTs 

transmembrane 4 superfamily member 1
upstream regulatory element binding prot
ESTs
Pirin

procollagen-lysine, 2-oxoglutarate 5-dio
H4 histone family, member G
metalloprotease 1 (pitrilysin family)
KIAA0512 gene product; ALEX2
LIM domain binding 2
ESTs, Weakly similar to I38022 hypotheti
KIAA0303 protein 

Sjogren syndrome antigen A2 (60kD, ribon
Homo sapiens cDNA FLJ23037 fis, clone L
thymosin, beta, identified in neuroblast
hypothetical protein MGC3178 
myosin X 

clone HQ0310 PRO0310p1
prostate differentiation factor
interleukin 1 receptor-like 1
hypothetical protein FLJ20373
hypothetical protein FLJ10210
Homo sapiens cDNA: FLJ23197 fis, clone R
ESTs, Weakly similar to B36298 proline-r
eukaryotic translation initiation factor
gap junction protein, alpha 1, 43kD (con
vesicle-associated membrane protein 5 (m
FK506-binding protein 1A (12kD)
SEC14 (S. cerevisiae)-like 1
glycyl-tRNA synthetase
heparan sulfate proteoglycan 2 (perlecan
myristoylated alanine-rich protein kinas
calumenin

tissue factor pathway inhibitor 2
platelet/endothelial cell adhesion molec
laminin, alpha 4
KIAA0096 protein
IGF-II mRNA-binding protein 3
complement component C1q receptor
melanoma cell adhesion molecule
SRY (sex determining region Y)-box 4
thrombospondin 1

nudix (nucleoside diphosphate linked moi
SRY (sex determining region Y)-box 4
Homo sapiens mRNA; cDNA DKFZp586E1624 (f
serine (or cysteine) proteinase inhibito
thymidylate synthetase
peripheral myelin protein 22
KIAA0015 gene product
adrenomedullin

ribonuclease, RNase A family, 1 (pancrea
thyroid hormone receptor interactor 12
nidogen 2
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100420 
100455
100529
100618
100619
100658
100676
100718
100752
100828
100850
100991
101097
101110
101142
101156
101168
101184
101192
101317
101336
101345
101400
101475
101485
101496
101505
101543
101557
101560
101587
101592
101633
101634
101667
101682
101714
101720
101741
101744
101793
101837
101838
101840
101857
101864
101931
101966
102012
102013
102024
102059
102121
102283
102300
102378
102395
102491
102499
102523
102560
102564
102589
102600
102645
102687
102693
102709
102759
102804
102882
102907
102915
102927
102960



D86983
D87953 

HG1862-HT1897

HG2614-HT2710 

HG2639-HT2735 

HG2855-HT2995

HG3044-HT3742 

HG3342-HT3519 

HG3543-HT3739 

HG4069-HT4339

HG417-HT417 

J03764 

L06797 

L08248 

L12711 

L13977 

L15388 

L19871 

L20859

L42176 

L49169 

L76380 

M15990 

M23254 

M24736 

M26576 

M27398 

M31166 

M31994 

M32334 

M35878 

M36429 

M57730 

M57731 

M60858 

M62994 

M68874 

M69043 

M74719 

M75126 

M84349 

M92843 

M92934 

M93056 

M94856 

M95787 

S76965 

S81914 

U03057 

U03100 

U03877 

U08021 

U14391 

U31384 

U32944 

U40369 

U41767 

U48959 

U51010 

U51478 

U53445 

U59289 

U59423 

U62015 

U63825

U67963 

U73379 

U73824 

U77604 

U81607 

U89942 

X04412 



X07820 
X12876 
X15729 



D86983 Hs.118893
AW888941 Hs.75789
BE313693 Hs.334330
AI752163 Hs.114599
N24433 Hs.241567
U56725 Hs.180414 
X02761 Hs.287820 
BE295928 Hs.75424 
T81309 

AL048753 Hs.303649 
AA836472 Hs.297939
J03836 Hs.82085
BE245301 Hs.89414
AI439011 Hs.86386 
L12711 Hs.89643 
AA340987 Hs.75693 
NM_005308 Hs.211569
NM_001674 Hs.460
BE247295 Hs.78452
L42176 Hs.8302
NM_006732 Hs.75678
NM_005795 Hs.152175
M15990 Hs.194148
BE410405 Hs.76288
AA296520 Hs.89546
X12784 Hs.119129
AA307680 Hs.75692
M31166 Hs.2050
BE293116 Hs.76392
AW958272 Hs.347326
AI752416 Hs.77326
AF064853 Hs.91299
NM_004428 Hs.1624
AV650262 Hs.75765
NM_005381
AF043045 Hs.81008
M68874 Hs.211587
M69043 Hs.81328
NM_003199 Hs.326198
AI879352 Hs.118625
W01076 HS27B57Z
M92843 Hs.343586
BE243845 Hs.75511
AA236291 Hs.183583
BE550723 Hs.153179
BE392588 Hs.75777
NM_006823 Hs.75209
X96438 Hs.76095
BE259035 Hs.118400
BE616287 Hs.178452
AA301867 Hs.76224
AI752666 Hs.76669
NM_004998 Hs.82251
AW161552 Hs.83381
AJ929721 Hs£120
AU076887 Hs.28491
AU077005 Hs.92208
U48959 Hs.211582
U51010 

BE243877 Hs.76941 
U53445 Hs.15432 
R97457 Hs.63984 
U59423 Hs.79067 
AU076728 Hs.8857 
AI984144 Hs.66713 
AL119566 Hs.6721 
NM_007019 Hs.93002
AA532780 Hs.183684
AA122237 Hs.81874
NM_005100 Hs.788
NM_002318 Hs.83354
AI767736 Hs.290070
BE409861 Hs.202833
X07820 Hs.2258
BE512730 Hs.65114 
AI904738 Hs.76053 



Melanoma associated gene 



calmodulin 2 (phosphorylase kinase, delt
collagen, type VIII, alpha 1
RNA binding motif, single stranded inter
heat shock 70kD protein 2
fibronectin 1

inhibitor of DNA binding 1, dominant neg
insulin-like growth factor 2 (somatomedi
small inducible cytokine A2 (monocyte ch



serine (or cysteine) proteinase inhibito

chemokine (C-X-C motif), receptor 4 (fus

myeloid cell leukemia sequence 1 (BCL2-r

transketolase (Wernicke-Korsakoff syndro

prolylcarboxypeptidase (angiotensinase C

G protein-coupled receptor kinase 5

activating transcription factor 3

solute carrier family 20 (phosphate tran

four and a half LIM domains 2

FBJ murine osteosarcoma viral oncogene h

calcitonin receptor-like

v-yes-1 Yamaguchi sarcoma viral oncogene

calpain 2, (m/II) large subunit

selectin E (endothelial adhesion molecul

collagen, type IV, alpha 1 

asparagine synthetase 

pentaxin-related gene, rapidly induced b

aldehyde dehydrogenase 1 family, member

intercellular adhesion molecule 2

insulin-like growth factor binding prote

guanine nucleotide binding protein (G pr

ephrin-A1

GRO2 oncogene

nucleolin

filamin B, beta (actin-binding protein-2
phospholipase A2, group IVA (cytosolic,
nuclear factor of kappa light polypeptid
transcription factor 4
hexokinase 1

CD59 antigen p18-20 (antigen identified
zinc finger protein homologous to Zfp-36
connective tissue growth factor
serine (or cysteine) proteinase inhibito
fatty acid binding protein 5 (psoriasis-



protein kinase (cAMP-dependent, catalyti
immediate early response 3 



catenin (cadherin-associated protein), a
EGF-containing fibulin-like extracellula
nicotinamide N-methyltransferase
myosin IE 

guanine nucleotide binding protein 11
dynein, cytoplasmic, light polypeptide
spermidine/spermine N1-acetyltransferase
a disintegrin and metalloproteinase doma
myosin, light polypeptide kinase
gb:Human nicotinamide N-methyltransferas
ATPase, Na+/K+ transporting, beta 3 poly
downregulated in ovarian cancer 1
cadherin 13, H-cadherin (heart)
MAD (mothers against decapentaplegic, Dr
cysteine-rich, angiogenic inducer, 61
hepatitis delta antigen-interacting prot
lysosomal

ubiquitin carrier protein E2-C
eukaryotic translation initiation factor
microsomal glutathione S-transferase 2
A kinase (PRKA) anchor protein (gravin)
lysyl oxidase-like 2
gelsolin (amyloidosis, Finnish type)
heme oxygenase (decycling) 1
matrix metalloproteinase 10 (stromelysin
keratin 18

DEAD/H (Asp-Glu-Ala-Asp/His) box polypep






WO 02/079492 



103011 
103020 
103029 
103036 
103056 
103080 
103095 
103138 
103176 
103195 
103347 
103371 
103432 
103471 
103967 
104447 
104764 
104783 
104798 
104865 
104877 



104952 
105113 
105178 
105196 
105215 
105263 
105271 
105330 
105461 
105492 
105493 
105594 
105727 
105732 
105767 
105882 
105936 
106031 
106124 
106222 
106241 
106263 
106264 
106366 
106454 
106634 
106724 
106793 
106799 
106842 
106868 
106890 
106961 
106974 
107030 
107061 
107086 
107216 
107385 
107444 
107985 
108507 
108695 
108931 
109001 
109195 
109390 
109456 
109737 
110411 
110660 
110905 
111018 
111091 



X52541 

X53416 

X54489 

X54925 

X57206 

X59798 

X60957 

X65965 

X69111 

X70940 

X87838 

X91247 

X97748 

Y00815 

AA303711 

L44538 

AA025351 

AA027050

AA029462 

AA045136 

AA047437 

AA054037 

AA071089 

AA156450 

AA187490 

AA195031 

AA205724 

AA227926 

AA227986 

AA234743 

AA253216 

AA256210 

AA256268 

AA279397 

AA292379 

AA292717 

AA346551 

AA400292 

AA404338 

AA412284

AA423987 

AA428594 

AA430108 

AA431462 

AA431470 

AA443756 

AA449479 

AA459916 

AA465226 

AA478778 

AA479037 

AA482597 

AA487561 

AA489245 

AA504110 

AA520989 

AA599434 

AA608649 

AA609519 

D51069 

U97519 

W28391 

AA035638 

AA083514 

AA121315 

AA147186 

AA156125 

AA188932 

AA219653 

AA232645 

F10078 

H48032 

H82117

N39564 

N54067 

N59858 



AJ243425 Hs.326035 
X53416 Hs.195464 
AW800726 Hs.789 
M13509 Hs.83169 
Y18024 Hs.78877
AU077231 Hs.82932
NM_005424 Hs.78824
X65965 

AL021154 Hs.76884 
AA351647 Hs.2642 
AU077309 Hs.171271 
X91247 Hs.13046 
X97748 

Y00815 Hs.75216 
AL120051 Hs.144700 
AW204145 Hs.156044 
AI039243 Hs.278585 
AA533513 Hs.93659 
AW952619 Hs.17235 
T79340 Hs.22575 
AI138635 Hs.22968
AF065214 Hs.18858 
AW076098 Hs.345588 
AB037816 Hs.8982 
AA313825 Hs.21941 
W84893 Hs.9305 
AA205759 Hs.10119 
AW388633 Hs.6682 
AA807881 Hs.25329 
AW338625 Hs.22120 
BE539071 Hs.69388 



AI805717 
AL047586



Hs.289112 
Hs.10283 



AB024334 Hs.25001 
AL135159 Hs.20340 
AW504170 Hs.274344 
AW370946 Hs.23457 
W46602 Hs.81988 



AI678765 

X64116 

H93366 



Hs.21812 

Hs.171844 

Hs.7567 



AA356392 Hs.21321 

BE019681 Hs.6019 

W21493 Hs.28329 

AL046859 Hs.3407

AA186715 Hs.336429
NM_014038 Hs.5216

W25491 Hs.288909 



N48670 
H94997 



Hs.28631 
Hs.16450 



BE313412 Hs.7961 
AF124251 Hs.26054 
BE185536 Hs.301183 
AA489245 Hs.88500 
AW243614 Hs.18063 
AI817130 Hs.3195
AL117424 Hs.25035
BE147611 Hs.6354
NM_012331 Hs.26458
D51069 Hs.211579
NM_005397 Hs.16426
W28391 Hs.343258



T40064 
AI554545



Hs.71968 
Hs.68301 



AB029000 Hs.70823 
AA147186 

AI056548 Hs.72116 
AF047033 Hs.132904 
AW007485 Hs.87125 
AW956580 Hs.42699 
AA055415 Hs.13233 
AW001579 Hs.9645 
AA782114 Hs.28043 
AA035211 Hs.17404 
AI287912 Hs.3828
AA300067 Hs.33032 



early growth response 1 

filamin A, alpha (actin-binding protein-

GRO1 oncogene (melanoma growth stimulati

matrix metalloproteinase 1 (interstitial

inositol 1,4,5-trisphosphate 3-kinase B

cyclin D1 (PRAD1: parathyroid adenomatos

tyrosine kinase with immunoglobulin and

gb:H.sapiens SOD2 gene for manganese su

inhibitor of DNA binding 3, dominant neg

eukaryotic translation elongation factor

catenin (cadherin-associated protein), b

thioredoxin reductase 1

gb:H.sapiens PTX3 gene promotor region.

protein tyrosine phosphatase, receptor t

ephrin-B1

ESTs 

ESTs 

protein disulfide isomerase related prot 

Homo sapiens clone TCCCIA00176 mRNA sequ

B-cell CLL/lymphoma 6, member B (zinc fi

Homo sapiens clone IMAGE:451939, mRNA se

phospholipase A2, group IVC (cytosolic,

desmoplakin (DPI, DPII)

Homo sapiens, clone IMAGE:3506202, mRNA,

AD036 protein

angiotensin receptor-like 1

hypothetical protein FLJ14957

solute carrier family 7, (cationic amino

ESTs

ESTs

hypothetical protein FLJ20505

CGW3 protein 

RNA binding motif protein 8B

tyrosine 3-monooxygenase/tryptophan 5-mo 

KIAA1002 protein 

hypothetical protein MGC12942 

ESTs 

disabled (Drosophila) homolog 2 (mitogen
ESTs

Homo sapiens cDNA: FLJ22296 fis, clone H

Homo sapiens cDNA: FLJ21962 fis, clone H

Homo sapiens clone FLB9213 PRO2474 mRNA,

Homo sapiens cDNA: FLJ21288 fis, clone C

hypothetical protein FLJ14005

protein kinase (cAMP-dependent, catalyti

RIKEN cDNA 9130422N19 gene

HSPC028 protein

hypothetical protein FLJ22471

Homo sapiens cDNA: FLJ22141 fis, clone H

ESTs

Homo sapiens clone 25012 mRNA sequence
novel SH2-containing protein 3
molecule possessing ankyrin repeats indu
mitogen-activated protein kinase 8 inter
Homo sapiens cDNA FLJ10768 fis, clone NT
Homo sapiens cDNA FLJ13698 fis, clone PL
chloride intracellular channel 4
stromal cell derived factor receptor 1
methionine sulfoxide reductase A
melanoma cell adhesion molecule
podocalyxin-like

proliferation-associated 2G4, 38kD

Homo sapiens mRNA; cDNA DKFZp564F053 (fr

ESTs

KIAA1077 protein

gb:zo38d01.s1 Stratagene endothelial cel
hypothetical protein FLJ20992 similar to
solute carrier family 4, sodium bicarbon
EH-domain containing 3
ESTs 

ESTs, Moderately similar to A47582 B-cel 
Homo sapiens mRNA for KIAA1741 protein, 
ESTs 
ESTs 

mitogen-activated protein kinase kinase
hypothetical protein DKFZp434N185 
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111356 
111378 
111741 
111769 
112318 
112951 
113057 
113195 
113490 
113542 
113803 
113847 
113910 
113947 
114047 
115061 
115819 
115870 
115964 
116228 
116264 
116314 
116589 
117023 
117112 
117156 
117176 
117280 



119866 
120655
121314 
121335 
121822 
121835 
122331 
122577 
123160 
123486 
124059 
124339 
124358 
124364 
124726 
124763 
125167 
125304 
125307 
125329 
107985 
125598 
125609 
116024 
418000 
126399 
127435 
127566 
127619 
434190 
128453 
128495 
128515 
128580 
128623 
128642 
128669 
128903 
128914 
129087 
129188 
129226 



129345 
129468 
129488 
101838 



N90933 

N93764 

R26124 

R27957 

R55470 

T16550 

T26674 

T57112 

T88700 

T90527 

W42789 

W60002 

W78175 

W84768 

W94427 

AA253217 

AA426573 

AA432374 

AA446622 

AA478771 

AA482594 

AA490588 

D59570 

H88157 

H94648 

H97538 

H98670 

N22107 

W38197 

W80814 

AA287347 

AA402799 

AA404418 

AA425107 

AA425435 

AA442872 

AA452860 

AA488687 

AA599674 

F13673 

H99093 

N22495 

N23031

R15740 

R39610 

W45560 

Z39833 

Z40583 

AA825437 

R66613 

R66613 

AA868063 

AA128075 

AA128075 

AA128075 

N66570 

AI051390 

AA627122 

AA627122 

X02761 

AF010193 

AA149044 

U82108 

D78676 

L35240 

AA598737 

R69417 

AA232837 

N72695 

M30257 

M96843 

X68277 

AA292440 

J03040 

AA228107 

AA449789 



BE301871 Hs.4867 
AW160993 Hs.226292
AB020653 Hs.24024 
AW629414 Hs.24230 
AW083384 Hs.11057 
AA307634 Hs.6550 
AW194301 Hs.339283 
H83265 Hs.8881 
BE178110 Hs.173374 
H43374 Hs.7890 
AW880709 Hs.283683
NM_005032 Hs.4114
AA113262 Hs.17901
W84768

AL035858 Hs.3807
AI751438 Hs.41271
AA486620 Hs.41135
NM_005985 Hs.46029
AA987568 Hs.74313
AI767947 Hs.50841
D51174 Hs.272239 
AI799104 Hs.178705 
AI557212 Hs.17132 
AW070211 Hs.102415 
AW969999 Hs.293658 
W73853 

H45100 Hs.49753 
M18217 Hs.172129 
W38197 

AA496205 Hs.193700 
AA305599 Hs.238205
W07343 Hs.182538 
AA404418 
AI743860 

AB033030 Hs.300570
AL133437 Hs.110771
AA829725 Hs.334437 
AA488687 Hs.284235 
BE019072 Hs.334802 
BE387335 Hs.283713 
H99093 Hs.343411 
AW070211 Hs.102415 
AF265555 Hs.250646 
NM_003654 Hs.104576
BE410405 Hs.76288
AL137540 Hs.102541
AL359573 Hs.124940
AW580945 Hs.330466
AA825437 Hs.58875
T40064 Hs.71968
T40064 Hs.71968
AA868063 Hs.104576
AA088767 Hs.83883
AA932794 Hs.83147
AA088767 Hs.83883
X69086 Hs.286161
AI051390 Hs.116731
AA627122 Hs.163787
AA627122 Hs.163787
X02761 Hs.287820
NM_005904 Hs.100602
BE395085 Hs.10086
U82108 Hs.101813
BE076608 Hs.105509
228913 Hs.102948
W28493 Hs.180414
AW150717 Hs.345728
AW857491 Hs.107125
AI348027 Hs.108557
NM_001078 Hs.109225
BE222494 Hs.180919
AA530892 Hs.171695
R22497 Hs.110571
AW410538 Hs.111779
AW966728 Hs.24642
BE243845 Hs.75511 



mannosyl (a!pha-1 glycoprotein beta- 
hypothetical gene DKFZp434A1114
KIAA0846 protein
ESTs

ESTs, Highly similar to T46395 hypotheti
vacuolar protein sorting 45B (yeast homo
Human DNA sequence from clone RP1-187J11
ESTs, Weakly similar to S41044 chromosom
Homo sapiens cDNA FLJ10500 fis, clone NT
Homo sapiens mRNA for KIAA1671 protein,
chromosome 8 open reading frame 4
plastin 3 (T isoform)

Homo sapiens, clone IMAGE:3937015, mRNA,
gb:zh53d03.s1 Soares_fetal_liver_spleen_
FXYD domain-containing ion transport reg
Homo sapiens mRNA full length insert cDN
endomucin-2

snail 1 (drosophila homolog), zinc finge

KIAA1265 protein

ESTs 

lysosomal 

Homo sapiens cDNA FLJ11333 fis, clone PL
ESTs, Moderately similar to I54374 gene
Homo sapiens mRNA; cDNA DKFZp586N0121 (f
ESTs
ESTs

uveal autoantigen with coiled coil domai
Homo sapiens cDNA: FLJ21409 fis, clone C
Empirically selected from AFFX single pr
Homo sapiens mRNA; cDNA DKFZp586I0324 (f
hypothetical protein PRO2013
phospholipid scramblase 4
gb:zw37e02.s1 Soares_total_fetus_Nb2HF8_
metallothionein 1E (functional)
KIAA1204 protein

Homo sapiens cDNA: FLJ21904 fis, clone H
hypothetical protein MGC4248
ESTs, Weakly similar to I38022 hypotheti
Homo sapiens cDNA FLJ14680 fis, clone NT
ESTs, Weakly similar to S64054 hypotheti
DEAD/H (Asp-Glu-Ala-Asp/His) box polypep
Homo sapiens mRNA; cDNA DKFZp586N0121 (f
baculoviral IAP repeat-containing 6
carbohydrate (keratan sulfate Gal-6) sul
calpain 2, (m/II) large subunit
netrin 4

GTP-binding protein

ESTs

ESTs

Homo sapiens mRNA; cDNA DKFZp564F053 (fr
Homo sapiens mRNA; cDNA DKFZp564F053 (fr
carbohydrate (keratan sulfate Gal-6) sul
transmembrane, prostate androgen induced
guanine nucleotide binding protein-like
transmembrane, prostate androgen induced
Homo sapiens cDNA FLJ13613 fis, clone PL
ESTs 
ESTs 
ESTs 

1 



type I transmembrane protein Fn14 
solute carrier family 9 (sodium/hydrogen 
CTL2 gene

enigma (LIM domain protein)
heat shock 70kD protein 6
STAT induced STAT inhibitor 3
plasmalemma vesicle associated protein
hypothetical protein PP1057
vascular cell adhesion molecule 1
inhibitor of DNA binding 2, dominant neg
dual specificity phosphatase 1
growth arrest and DNA-damage-inducible,
secreted protein, acidic, cysteine-rich
methionine adenosyltransferase II, beta
connective tissue growth factor
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413731 AM49789 
129557 W01367 
129619 AA610116 
129827 AA25830S 
129762 AA460273 
129884 AA286710 
130018 T68073 
130147 D63476 
130178 M62403 
130282 X55740 
130431 L10284 
130495 AA243278 
130553 AA430032 
130638 H16402

130639 D59711
130657 T94452
130686 AA431571
130776 R79356
130818 AA280375

130840 Z49269 
130899 Z41740 
131002 AA121543 
131080 J05008 
131084 AA101878 
131091 T35341 
131107 N87590 
131182 AA256153 
131207 W74533 
131319 U25997 
131328 V01512 
131509 X56681 
131555 AA161292 
131564 AA491465
131573 AA046593 
131692 D50914 
131756 D45304 
131859 M90657 
131909 W69127 
131915 AA316186 
132046 AA384503 
132050 AA136353 
132151 AA044755 
132164 U84573 
132187 AA058911 
132303 AA620962 
132314 AA285290 
132358 X60486 
132398 R31641 
132421 AA489190 
132490 F13782 
132520 AA257993 
132546 M24283 
132610 AA443114 
132716 T35289 
132840 N23817 
132883 AA047151 
132958 N77151 
132989 AA480074 
132999 Y00787 
133071 T99789 
133076 W84341 
133099 L09209
133147 D12763 
133149 T16484 
133161 AA253193 
133200 AA432248 
133220 X82200
133260 AA083572 
133295 L00352 
133349 N75791 
133391 X57579 
133398 X02612
133436 H44631 
133454 AA090257 
133478 X83703
133491 L40395 



BE243845 Hs.75511 
AL045404 Hs.46366 
AA209534 Hs.284243
T40064 Hs.71968 
AA453694 Hs.12372 
AF055581 Hs.13131 
AA353093 

D63476 Hs.172813 
U20982 Hs.1516
BE245380 Hs.153952 
AW505214 Hs.155560 
AW250380 Hs.109059 
AF052649 Hs.252587
AW021276 Hs.17121
AI557212 Hs.17132
AW337575 Hs.201591
BE548267 Hs.337986
AF167706 Hs.19280
AW190920 Hs.19928
BE048821 Hs.20144
AI077288 Hs.295323
AL050295 Hs.22039
NM_001955 Hs.2271
NM_017413 Hs.303084
AJ271216 HS.22S80
BE620886 Hs.75354
AI824144 Hs.23912
AF104266 Hs.24212
NM_003155 Hs.25590
AW939251 Hs.25647
X56681 Hs.2780



T47364 
T93500 
AA040311



Hs.278613 

Hs.28792 

Hs.28959 



BE559681 Hs.30736 
AA443966 Hs.31595 
AW960564 

NM_016558 Hs.274411
AI161383 Hs.34549 
AI359214 Hs.179260 
AI267615 Hs.38022 
BE379499 Hs.173705 
AI752235 Hs.41270 
AA235709 Hs.4193 
BE177330 Hs.325093 
AF112222 Hs.323806
NM_003542 Hs.46423
AA876616 Hs.16979
AW163483 Hs.48320
NM_001290 Hs.4980
AA257992 Hs.50651 
M24283 Hs.168383 
AA160511 Hs.5326 
BE379595 Hs.283738 
BE218319 Hs.5807 
AA373314 Hs.5897 
AF234532 Hs.61638 
AA480074 Hs.331328 
Y00787 Hs.624 
BE384932 Hs.64313 
AW946276 Hs.6441 
W16518 Hs.279518 
AA026533 Hs.66 
AA370045 Hs.6607 
AW021103 Hs.6631 
AB037715 Hs.183639 
NM_006074 Hs.318501
AA403045 Hs.6906 
AI147861 Hs.213289 
AW631255 Hs.8110 
AW103364 Hs.727 
NM_000499 Hs.72912
BE294068 Hs.737 
BE547647 Hs.177781 
X83703 Hs.31432 
BE619053 Hs.170001 



connective tissue growth factor 
KIAA0948 protein
tetraspan NET-6 protein

Homo sapiens mRNA; cDNA DKFZp564F053 (fr
tripartite motif protein TRIM2
lysosomal
metallothionein 1L

PAK-interacting exchange factor beta
insulin-like growth factor-binding prote
5' nucleotidase (CD73)
calnexin

mitochondrial ribosomal protein L12
pituitary tumor-transforming 1
ESTs 

ESTs, Moderately similar to I54374 gene
ESTs

Homo sapiens cDNA FLJ10934 fis, clone OV

cysteine-rich motor neuron 1

hypothetical protein SP329

small inducible cytokine subfamily A (Cy

serum/glucocorticoid regulated kinase

KIAA0758 protein

endothelin 1

apelin; peptide ligand for APJ receptor

dipeptidylpeptidase III

GCN1 (general control of amino-acid synt

ESTs

latrophilin

stanniocalcin 1

v-fos FBJ murine osteosarcoma viral onco

jun D proto-oncogene

interferon, alpha-inducible protein 27

Homo sapiens cDNA FLJ11041 fis, clone PL

ESTs 

KIAA0124 protein 
ESTs 

transmembrane 4 superfamily member 1

SCAN domain-containing 1

ESTs, Highly similar to S94541 1 clone 4

chromosome 14 open reading frame 4

ESTs

Homo sapiens cDNA: FLJ22050 fis, clone H

procollagen-lysine, 2-oxoglutarate 5-dio

DKFZP58601 624 protein

Homo sapiens cDNA: FLJ21210 fis, clone C

pinin, desmosome associated protein

H4 histone family, member G

ESTs, Weakly similar to A43932 mucin 2 p

double ring-finger protein, Dorfin

LIM domain binding 2

Janus kinase 1 (a protein tyrosine kinas 

intercellular adhesion molecule 1 (CD54) 

amino acid system N transporter 2; porcu 

casein kinase 1, alpha 1 

GTPase Rab14 

Homo sapiens mRNA; cDNA DKFZp586P1622 (f 
myosin X 

hypothetical protein FLJ13213
interleukin 8

ESTs, Weakly similar to AF257182 1 G-pro
Homo sapiens mRNA; cDNA DKFZp586J021 (fr
amyloid beta (A4) precursor-like protein
interleukin 1 receptor-like 1
AXIN1 up-regulated
hypothetical protein FLJ20373
hypothetical protein FLJ10210
Homo sapiens mRNA full length insert cDN
Homo sapiens cDNA: FLJ23197 fis, clone R
low density lipoprotein receptor (famili
L-3-hydroxyacyl-Coenzyme A dehydrogenase
inhibin, beta A (activin A, activin AB a
cytochrome P450, subfamily I (aromatic c
immediate early protein
hypothetical protein MGC5618
cardiac ankyrin repeat protein
eukaryotic translation initiation factor
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133510 AA227913 AW880841 Hs.96908 pKMnduced protein' 

133517 X52947 NM_000165 Hs.74471 gap junction protein, alpha 1, 43kD (con

133528 M11313 AU077051 Hs.74561 alpha-2-macroglobulin

133538 L14837 NM_003257 Hs.74614 tight junction protein 1 (zona occludens

133562 M50721 M50721 Hs.74670 H2.0 (Drosophila)-like homeo box 1

133584 D90209 D90209 Hs.181243 activating transcription factor 4 (tax-r

133590 T67986 T70956 Hs.75106 clusterin (complement lysis inhibitor, S

133617 AA148318 BE244334 Hs.75249 ADP-ribosylation factor-like 6 interacti

133651 U97105 AI301740 Hs.173381 dihydropyrimidinase-like 2

133671 T25747 AW503116 Hs.301819 zinc finger protein 146

133678 K02574 AW247252 nucleoside phosphorylase

133681 D78577 AI352558 tyrosine 3-monooxygenase/tryptophan 5-mo

133722 X53331 AW969976 Hs.279009 matrix Gla protein

133730 S73591 BE242779 Hs.179526 upregulated by 1,25-dihydroxyvitamin D-3

133750 X95735 BE410769 Hs.75873 zyxin

133802 L16862 AW239400 Hs.76297 G protein-coupled receptor kinase 6

133825 U44975 BE616902 Hs.285313 core promoter element binding protein

133838 M97796 BE222494 Hs.180919 inhibitor of DNA binding 2, dominant neg

133859 U86782 U86782 Hs.176761 26S proteasome-associated pad1 homolog

133889 AA099391 U48959 Hs.211582 myosin, light polypeptide kinase

133960 M19267 M19267 Hs.77899 tropomyosin 1 (alpha)

133975 D29992 C18356 Hs.295944 tissue factor pathway inhibitor 2

133977 L19314 AI125639 Hs.250666 hairy (Drosophila)-homolog

134039 S78569 NM_002290 Hs.78672 laminin, alpha 4

134075 U28811 NM_012201 Hs.78979 Golgi apparatus protein 1

134081 L77886 AL034349 Hs.79005 protein tyrosine phosphatase, receptor t

134164 C14407 AW245540 Hs.79516 brain abundant, membrane attached signal

134203 M60278 AA161219 Hs.799 diphtheria toxin receptor (heparin-bindi

134238 R81509 AA102179 Hs.160726 Homo sapiens cDNA FLJ11680 fis, clone HE

134299 AA487558 AW580939 Hs.97199 complement component C1q receptor

134332 D86962 D86962 Hs.81875 growth factor receptor-bound protein 10

134339 AA478971 R70429 Hs.81988 disabled (Drosophila) homolog 2 (mitogen

134343 D50683 D50683 Hs.82028 transforming growth factor, beta recepto

134381 U56637 AI557280 Hs.184270 capping protein (actin filament) muscle

134403 M61199 AA334551 sperm specific antigen 2

134416 M28882 X68264 Hs.211579 melanoma cell adhesion molecule

134493 X15183 M30627 Hs.289088 heat shock 90kD protein 1, alpha

134558 S53911 NM_001773 Hs.85289 CD34 antigen

134817 U20734 AU076592 Hs.198951 jun B proto-oncogene

134983 D28235 D28235 Hs.196384 prostaglandin-endoperoxide synthase 2 (p

134989 AA236324 AW968058 Hs.92381 nudix (nucleoside diphosphate linked moi

135052 AA148923 AL136653 Hs.93675 decidual protein induced by progesterone

135062 AA174183 AK000967 Hs.93872 KIAA1682 protein

135069 AA456311 AA876372 Hs.93961 Homo sapiens mRNA; cDNA DKFZp667D095 (fr

135071 L08069 W27190 Hs.94 DnaJ (Hsp40) homolog, subfamily A, membe

135073 AA452000 W55956 Hs.94030 Homo sapiens mRNA; cDNA DKFZp586E1624 (f

135170 AA282140 T53169 Hs.9587 Homo sapiens cDNA: FLJ22290 fis, clone H

135196 J02854 C03577 Hs.9615 myosin regulatory light chain 2, smooth

135348 AA442054 U80983 Hs.268177 phospholipase C, gamma 1 (formerly subty
TABLE 4A 

Table 4A shows the accession numbers for those pkeys lacking unigeneID's for Table 4. The pkeys in Table 7 lacking unigeneID's are represented within 
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled 
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and 
Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 
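
As an illustration only, the row layout described above (a Pkey, a CAT number, then the Genbank accessions of the cluster, continued over following lines) could be read into a lookup structure as sketched below. The function name `parse_cluster_table`, the regular expression, and the assumption that continuation lines never begin with a Pkey and CAT number are hypothetical and are not part of the original disclosure; the sample rows are taken from Table 4A.

```python
import re
from collections import defaultdict

# Assumed row pattern: a six-digit Pkey, a CAT number such as "3672.1"
# or "145392_1", then one or more accession tokens.
ROW_START = re.compile(r"^(\d{6})\s+(\d+[._]\d+)\s+(.*)$")


def parse_cluster_table(lines):
    """Group accession tokens under (pkey, cat_number).

    Lines that do not start with a Pkey and CAT number are treated as
    continuations of the most recent row.
    """
    clusters = defaultdict(list)
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines carry no data
        match = ROW_START.match(line)
        if match:
            current = (match.group(1), match.group(2))
            tokens = match.group(3).split()
        elif current is not None:
            tokens = line.split()
        else:
            continue  # text before the first data row is skipped
        clusters[current].extend(tokens)
    return dict(clusters)


sample = [
    "125565 1704098.1 R20840 R20839",
    "",
    "131859 3672.1 AW960564 AA092457 T55890",
    "AW176446 AA304671",
]
table = parse_cluster_table(sample)
```

Keying on the (Pkey, CAT number) pair keeps each probeset tied to the gene cluster from which its oligonucleotides were designed.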



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT Number Accession 

100752 33207.21 T81309 BB019033 R94181 BE019198 NM_000612 J03242 AW411299 BE300064 BE297544 R94182 AW630108 T53723 
058853 H78073 H80594 BE299560 T48899 H70196 M17426 N77077 S77035 H58384 H61664 H78540 T84527 C17198 
H60255 H71980 R92644 W79050 X00910 M29845 R91055 M17863 M17862 T71815 BE299561 BE464561 X06260 
R94741 T54216 C18594 BE262015 X06161 AW409889 AA378400 BE263228 BE313278 R881 16 BE313457 H43500 
T48617 BE313761 H77309 A1207601 X08159 H40413 X03425 T87663 R10627 X03562 M1411B W03982 R97520 H81229 
T83157 H83168 H48762 AA669898 BE263054 H47289 AA022807 R1 1555 H74260 R76968 R28338 H72534 H72464 
H52031 N72478 N45355 AW411300 R89113 R69135 H58454 T83281 R93476 H69645 H68015 TB2229 H71089 T85121 
H59939 W65299 N78176 H53909 N72373 R21788 H04660 H59639 H61874 BE262219 T53614 N73335 N50464 W00943 
N77189 R89257 AA570502 R89432 R06366 AA553480 AA776271 AA551359 AA551050 H51670 AA601052 BE299081 
H68198 H52276 BE207832 N91 192 H70332 XD7868 X07868 H69464 H53782 H73710 R80435 AA553384 AW884176 
N53475 T71662 AW954036 AW954033 AA552931 H93206 AA430218 AA553476 AI918470 T54124 BE207982 BE300177 
N73994 AW882625 N39549 N53838 AA722389 H71878 H58909 H37849 H78435 T47933 R77174 R83814 AA41 1890 
H94199 AA663208 BE205778 AA490137 H70492 R98232 H37800 AA679294 H40341 H74238 H47290 H73231 T4861B 
AA025428 AI039521 H92959 N59389 H80538 H72933 T90630 AA411891 N55000 H74225 AA340290 AW9570S1 T54316 
AA340437 H57125 H58908 H79027 H63450 N74623 R93425 H68714 H68758 N68396 H48763 N69256 H57320 H53831 
H53589 N68833 N52453 H56048 H69870 H78074 R69253 R83375 T53615 H94330 H58455 H90864 T47934 H74261 
R89258 R97997 R91056 R28339 R85760 H78235 R97521 H67692 H40358 AA022688 H52513 H59601 T88690 H65256 
H63397 W65397 AA553588 R19280 N52645 W73930 R06367 R21743 H72372 N73921 AW883539 AW882639 T40616 
H47084 R95723 AA634316 AA862781 H77310 R91389 H33111 R92767 T54512 R89341 H70333 H57817 H82941 H62032 
N52638 H58385 T91796 H51086 AA340292 T49918 H81230 R36121 N50411 T87664 N62436 N39340 AA665637 
AA340446 H93377 H92973 BE29S290 BE269788 H51665 AA340444 N54605 AA454101 R10628 R94200 A1200549 
AA342640 BE298855 BE250229 T49916 H82008 N28278 AW880662 H71268 N76791 H47685 H65255 W05198 
AW889144 M76577 H71702 H68036 H71915 R91612 R87807 H68059 Al 133328 A1247866 AA621443 AW881050 
AA700847 AA340413 AW878508 AW881101 AW878249 H71916 N54596 BE161581 AW878082 W04212 AW881040 
AW885492 AW880519 AA334887 AW878715 W06882 AW630222 AW885381 H70869 AW381778 H47601 AW889982 
H63868 AW884986 AW878713 AW878685 R38391 AW878694 AA368D70 C03393 AW878695 AW878705 AW878665 
AW878742 AW878620 AW878823 AW878688 R29048 AW878690 AW878686 AW878810 AW878827 AW878733 
AW878S59 AW878749 AW878681 AW883353 AW883277 AW883300 AW883565 AW883298 AW883143 AW883045 
AW883482 AW883352 AW883417 AW883357 AW883231 AW883474 AW883355 AW882620 AW882533 AW883754 
AW883139 AW882827 AW883641 AW883567 AW883481 AW882983 AW882982 AW882465 AW883419 AW882466 
AW883639 AW8B3230 AW882981 AW882534 AW882874 AW882619 AW883480 AW882826 AW882B31 AW882835 
AW882830 AW883563 AW882456 AW627642 

117156 145392_1 W73853 AA928112 W77887 AW889237 AA148524 AI749182 AT754442 AI338392 AI253102 AI079403 AI370541 AI697341 
H97538 AW188021 AI927669 W72716 A1051402 AI18B071 AI335900 N21488 AW770478 W92522 AI691028 AI913512 
A1144448 W73819 AA604358 N28900 W95221 AI868132 H98465 AA148793 

131859 3672.1 AW960564 AA092457 T55890 D56120 T92525 AJ815987 BE182608 BE182595 AW080238 M90657 AA347236 AW961686 

AW176446 AA304671 AW583735 T61714 AA316968 AI446615 AA343532 AA083489 AA488005 W52095 W39480 N57402 
D82638 W25540 W52847 D82729 D58990 BE619182 AA315188 AA308636 AA112474 W76162 AA088544 H52265 
AA301631 HB0982 AA113786 BE620997 AW651691 AA343799 BE613669 BE547180 BE546656 F1 1933 AA376800 
AW239185 AA376086 BE544387 BE619041 AA452515 AA001806 AA190873 AA180483 AA159546 F00242 AI940609 
Ai940602 AI189753 T97663 T66110 AW062896 AW06&10 AW062902 AI051622 A1828930 AA102452 AI685095 AI819390 
AA557597 AA383220 AI804422 A1633575 AW338147 AW603423 AW606800 AW750567 AW510672 A1250777 AA083510 
AW629109 AW513200 AA921353 AI677934 AI148698 A1955858 AA173825 AA453027 AI027865 AW375542 AA454099 
AA733014 AI591384 R79300 R80023 AA843108 AA626058 AA844898 AW375550 AA889018 AI474275 AW205937 
AI052270 AW388117 AW388111 AA699452 AI242230 N47476 H38178 AA366621 AA1 13196 AA130023 H39740 T61629 
AI885973 AW083671 AA1 79730 AA305757 AI285455 N83956 AA216013 AA336155 AW999959 T97525 AA345349 
T91762 AA771981 AE85092 AJ591386 BE392486 BE385852 AA682601 AI682884 AA345840 T85477 AA292949 
AA932079 AA098791 D82607 T48574 AW752038 C06300 

125565 1704098.1 R20840 R20839 

133607 1227.6 BE273749 BE397561 BE387189 AL037858 AL037878 AI963094 BE259216 AA011363 AL036189 BE562325 AA251169 
BE617431 N98537 AA158093 AL047800 M34539 NM_000801 AA312140 D16971 AA158904 AA307114 AA312803 
T09203 AW629686 AL04B504 BE3B8578 AA220957 AA158364 BE267385 AA294971 C 18055 BE241757 AA1 15056 
AI936769 BE378435 BE206971 AW674924 BE622060 AA604674 AA1 15273 AW402159 AA338608 BE568819 M80199 
X55741 AA375111 AA376016 BE612671 AA805742AW405588 N25850 N44580 H06031 AW403549 BE536552 AA056726 
BE543239 AA082517 AE01645 A1201642 Al 192622 N40104 AA370921 BE547569 AB59502 AA302038 AI197890 
AW268354 AI014938 W45448 A1541395 AA037272 BE538826 AL039813 BE536130 AA299355 AW805147 AW974624 
H53220 AI471471 AA399303 AA007386 W35106 BE613277 R12739 R12738 AA304342 AA687802 BE409581 AI498844 
AV662092 AW904105 AA011375 BE315214 H99302 BE537893 N32299 AW855B29 AI291320 BE078322 AI301395 
AA3033S2 N32719 AA358328 AA357877 AI952540 H56279 H02758 H02048 AW805233 R82224 AA410772 AA291352 
134403 17037.1 
126872 142696_1 
121335 279548_1 
130018 18966_1 
121822 244391_1 
BE171109 N69935 BE169248 AA361173 H44978 BE617887 D52560 AA084043 W03595 R67219 N36477 N42924 R67104 
H44901 H79695 W21 105 AA393988 W30899 AA31 6096 BE522B96 W46872 AA442678 BE544893 BE5401 12 BE621873 
AA338067 N55052 BE398154 BE621210 AA740760 C03739 C03206 BE398692 AA482370 AA031614 AA301575 
AA304710 AA132153 AAD29796 AA994960 H19567 AA442969 H49781 H46871 AAQ35395 AA056185 AA149378 
AA643080 AL135479 AA292329 AA654337 AA041228 AA454888 AA025039 W58331 AA625981 T94941 AA302448 
H199Q0 AA21B956 AA513790 AA563962 AA398076 W44441 AA293276 W47373 AA625879 W30688 AA043029 T64284 
R79151 AA304340 AA485185 AA604939 R82470 AA421425 AW771456 AB39329 AA304424 AA605236 AA936934 
AA587673 AI209162 A1697301 AI479995 AI679814 AI361950 AW189125 AI955888 AI986019 BE301019 AI084792 
AI31Q211 AW189307 AI022070 AW977204 AI146825 AW190163 AW303281 AI828345 BE046043 AW029257 AA482268 
AI246507 AI420729 AW084932 AW439514 AI890487 AW439692 AI523896 AI186612 AI659953 AI889773 AA687527 
AW072694 AW262153 AW467371 AI613269 A1679238 D54404 AA158103AW105527 AW149739 AW150361 AW268387 
AW1 17708 AI951682 AI687440 AW674285 AA678365 AI587082 AA732095 AA01 9899 W45661 AAB27300 BE61 3304 
AA765891 AA612935 AB14658 AW316916 R66594 AA514640 AA025040 AA031472 AW732076 AA029797 AI244560 
AI128734 AW381720 AI092360 AB63283 AW613175 AI890675 AI720156 AW531348 AI635106 AI278045 AA303979 
AA703505 W45449 AW078661 AI292052 AW381707 AI147854 AW381743 AA1 58905 AA303258 AA888144 AW195967 
AA428705 AA989559 AA617731 H19882 BE543418 AAB30386 AA421302 W58652 T94995 AI869743 AI679145 
AW085971 N98425 AA765136 AI347027 A1356955 AA928038 A1679717 AA458459 AA679281 AI367973 AI270041 
AA765135 AA732793 AI798447 M668646 AA251008 AB84538 AI401737 AA0561 86 BE043308 AW662375 AJ3021 10 
N50724 W96332 BE537047 N26983 AI567172 AA765296 AW673237 N29784 AA534275 AA084044 AW067973 
AW300766 T63398 W46823 R39790 AJ364185 AW298582 AA454814 AW069878 N67751 H05982 N23140 AB62647 
AI302086 AI767772 N25755 H531 14 AA706133 T93511 AA429291 AA935294 AA987647 W02803 R66595 AI680795 
W23673 AW440794 AA722872 H49538 AW131042 AA531603 AA908665 AA040791 AA235312 W52205 N93444 R82180 
H02759 H79696AW088894 H56079AA961143AW067776AW973745 AA016311 AW071227 AA017511 AI753994 
W47374 T64155 AA296092 AI698626 AA558158 AA296088 AW794259 HOI 963 AA149267 AA485076 AA975856 H44938 
AA035396 AI955555 H46289 AA486161 AI631222 AA359047 AW794253 A1806962 AW243930 AA526145 AW878734 
AA018464 AA132031 R67220 R79152 AA296093 H54300 AI00516O BE242548 AW992803 AW878644 AWB78666 T27742 
R82471 AW517604 AW472738 AI282904 R39791 AA486098 AW467891 AW960520 AA551736 AA056621 AW945197 
R66373 AA554236 BE242202 AI904376 A1832590 H 19484 R00890 AI627677 AA302287 AIB69451 AI734855 AI708073 
A1832902 AA585184 AW204299 AA055565 D12417 D11975 T63543 AW664099 R54423 BE612712 T95340 T63985 
AA598917 T40735 T64053 AA149284 AW272548 AA363445 AA042893 AW300697 BE261973 T53501 T53500 AW878729 
AW878657 AW794391 AA069193 R01553 H44875 AA385406 AA533968 M93060 AL135600 W96331 AA017651 
AA018849 AA017692 H85337 BE278690 AA73159B AA018512 AI076813 AI022644 R025B5 X52220 AW296894 AA825671 
AI699321 AI393601 AW592511 AI146747 AA608921 AA158365 AW590007 AA354519 D20081 R02704 AW798339 
M92422 AA094903 AA007676 

AI352558 Z82248 X78138 NM_003405 AU077248 AA223125 S80794 D78577 AI124697 AW403970 BE614089 BE296713 
BE621334 120422 X80536 D54224 D54950 X57345 N29226 AA127798 AA340253 F08031 AA192540 H67636 AA321827 
AW950283 AA084159 BE538808 AW401377 AA256774 C03366 W46595 W47608 AA305009 H69431 H69456 AL120082 
H1 1 706 AA303717 AA361357 H22042 H78020 AW999584 AA134368 AA32291 1 AA322961 H60980 N85248N31547 
H79624T11718 W85826 AW894663 AW894624 BE167441 BE170015 AA304626 AW602163 AW998929 AA1566B1 
AA151067 BE002724 AA608688 H82692 BE155392 AW383636 BE155394 AA487004 AW383504 AI342355 R82553 
W16498 BE155344 AI143938 R69901 AA322873 AW340648 R25364 AA367935 AI559406 AA033522 AA374252 
AWB35019 A1922133 AI697089 N99662 AW189078 AI199076 AW151598 W59944 AA662875 W94022 AA299055 
AI03900B AI829449 AA583503 A1635674 AW131665 AW73820 AW2731 18 AW900930 AA908944 AI688035 AW17Q272 
AKJ82545 AW468176 A1608761 AI082748 A1911682 A1243943 AI831016 AA192465 AC18477 AA93B408 AA385288 
AI809817 M905196 AI191245 AW70204 AI188296 AI421357 AI125315 AI087141 AA629032 AA740589 AI554181 
AA150830 AI248541 AI077943 AA775958 AA864930 A1261476 A1123121 AI310394 AA862331 AA87247B BE537084 
AI205606 AA720684 AI872093 AW150042 AL120538 AA219627 AA988608 C21397 AI359337 H25337 A1089749 
AA605146 AI359620 AA150478 AI359738 AW383642 AW995424 AI766457 R56892 AI089839 W61343 N69107 W46459 
AA555955 N20527 AI279782 W46596 AA776573 H23204 AI866231 AI083995 N21 530 AA1 26874 D82630 W65437 
AI086917 AW382095 AI086877 H69844 AW340217 W85827 L08439 AA262704 AA505380 W47413 W94135 AA223241 
AW089153 AA084101 BE538000 AA096126 T28031 AA491574 R84813 AA774536 AW383522 AA155615 AW383529 
AA49152DAW028427M171496AI469689AW664539A1811102AI811116BE464590BE350791 H78021 T15405H21979 
AA219489 H13301 AA505883 AI864305 AM23963 AW084401 F04963 R69858 H67097 A1917740 AK55561 H69864 
AA033631 AW383484 A1886261 H25293 AA513281 AW271 187 H1 1617 N79982 AI174338 AI9O4207 AI904208 BE614558 
W94127 W65436 AI272249 AA700018 AI579932 A1085941 AW152629 

AA334551 BE008229 AA307537 AW961156 AW995894 AW995826 NM_006751 M61199 AA045603 AL036372 AV5456C5 
AI688095 AW351901 AA101337 AA101345 N73342 BE018030 BE559044 AW841975 AA373388 BE090412 H95440 
N53845 R67867 AA093441 AA363427 H93708 AW023134 AW994986 AW994989 BE090429 R23614 A1567932 H03726 
H01 101 HOI 857 AA548743 AI671B06 AW872949 AW872941 AA742447 A1199788 AA045604 AI537465 AI741796 
AW242217 AW131463 AI765302 AI683923 AA889762 AIB04889 AI986437 C06049 BE502340 AI695651 A1491970 
AA496804 AA281008 AA665699 AI473814 BE301445 AA707837 AA551925 A1017348 A1208185 AA775203 AA156296 
AA557463 H95441 AA768547 AW769358 AA991197 AA181954 AJ091389 AI147289 AW771837 AI638582 AA84441 1 
AJ374750 T29320 AW951272 AW085923 H02834 AA843259 AA814696 AW183290 AA158453 N68125 N69039 AA100423 
AA101346 AI918720 HO1 102 R67868 H01868 N66438 R46580 A1858433 AA599560 AA187577 AA157481 AA361520 
AL047827 AA158452 R21688 AW964874 AA325161 R40871 AW752395 AW375924 R13355 AA2B1 174 AA428908 
AW450979 AA136653 AA136656 AW419381 AA984358 AA492073 BE168945 AA809054 AW238038 BE01 1212 BE01 1359 
BE011367 BE011368 BE011362 BE01 1215 BE011385 BE011363 
AA404418 AI217248 

AA353093 AW957317 AW872498 AJ560785 AI2891 10 AW135512 X97261 T68873 

AI743860 N49543 AWD27759 BE349467 AI656284 BE463975 R35022 AA370031 AW955302 AL042109 N53092 AI61 1424 
AL079362 AI969290 AI928016 BE394912 BE504220 BE467505 AI61 161 1 AI611407 A161 1452 W56437 AI284566 
AI583349 AW183Q58 AJ308085 AW74952 AA437315 AA628161 AW301728 AI150224 AA400137 AA437Z79 AI223355 
AA639462 AI261373 AI432414 AB84994 A1539335 AA401550 AA358757 AI509976 AA442357 AA359393 AA437046 
AA370301 AA429328 AW272055 AI580502 AI832944 AI038530 AA425107 AI014986 AI148349 AW237721 AW779756 
AW137877 AI125293 AA400404 R28554 



AA608588 
123533 genbank_AA508751 AA508751 

125091 genbank_T91518 T91518 

123964 genbank_C13961 C13961 

102491 entrez_U51010 U51010 

118475 genbank_N66845 N65845 

118581 genbank_N58905 N68905 

113947 genbank_W84768 W84768 

101447 entrez_M21305 M21305 

101667 13349_1 NM_005381 M60858 AW373732 AW373724 AW373689 AW373629 AW373609 AW373776 AA187806 AW386946 

AW374207 T05235 AA216203 AW385556 AA305940 AA306526 AA315461 AL036757 AW373711 AW403124 AW403640 
AW377084 T27360 H62638 F06957 AW377051 AA554779 AA378568 AA096007 AW352407 AW302637 F07929 H1 7433 
AW382712 H05665 F07292 N39875 AA089729 H62556 N42842 R12952 AW373735 AW364155 AA056183 W39185 
AW382708 N32488 AF1 14096 AW375993 A1133569 W52561 AA603040 AA133710 AI928795 AW176370 AA827519 
AW338437 AA521 142 T29341 AI800461 AW317002 AA703914 AA860830 AI859203 AI445772 AA714334 AI817066 
AI832027 AW510442 AI635602 AW088306 AW068672 AW408555 AW467542 AA552657 AA152367 W32081 AA582124 
AA074040 AA931657 AI051154 AW410203 AI921644 H17434 AI832330 AW404836 AI925038 AA088423 AA954166 
AA5B0453 AW021292 A1267215 AW0B0082 AW383778 AI933053 AI919097 W31557 N90245 AA931591 AA563995 
F36352 AA0561 84 AA476294 AA641327 AA533550 AI749S30 W58323 AA5691 19 AA508573 AI809050 AB78996 
AA411382 AW407505 AA938104 AA074041 AA632876 AW193748 AA507873 AI270128 AI472365 AA411363 AI523216 
AI719965 AE816302 AA182681 AI707990 AA133588 AI758537 W60253 AI460308 AA135423 AI083904 F04188 N89693 
AW408776 AI678595 AI270568 AA722059 W58234 F33550 AA090547 AA2851 08 AA425981 N85079 D20218 AC73980 
AA159028 F03228 AW247914 N26918 AW272741 N901Q9 H05666 N23327 AW247953 R44748 AA962015 F03558 
AT752394 AW409913 AW248396 AI816463 AI752393 AA325370 AA263089 AI570130 AI971951 AJ 160658 A1357350 
AW168686 AL121075 AW050536 N21672 W67748 AA514242 AI127386 H14607 A1185752 W79364 AA088520 AA152476 
AW351940 AW373683 A1940524 AW374953 T56500 N24329 AI940720 AW374933 AW374947 AW391913 AL138337 
AW376241 AW062943 F26666 AW410202 AW062958 F34529 AW381807 AW393315 W17147 AW176359 AA664576 
A W380424 AA306040 AI745674 AW300951 AI188579 AM38973 AJ305271 AA433818AA612807 AJ8318Q9AI9404O9 
AA1 58663 AI572988 

108931 genbank_AA147186 AA147186 

103138 entrez_X65965 X65965 

103432 entrez_X97748 X97748 

119174 genbank_R71234 R71234 

133678 11235_1 AW247252 AA346143 NM_000270 AA381085 N91995 X00737 AA381079 AA296473 AA296110 AA315735 AA311617 

AA326750AA376804 AW403290T95231 M13953 T47963 H82039 AA279899 AA627997 N76320 N99527 H37842 
W20095 AA457308 AW469547 AA724143 H83220 AA319496 W86334 W30892 R89169 R99427 N41854 H47286 
AA348094 AA045089 R6301 6 A192221 9 AKJ24906 A1096488 A1885005 AA1 94872 N90489 AI452544 H7241 1 AA282427 
AA430735 R68963 R22453 H70385 AW129369 AW467320 AW519082 AA345018 AA582183 A1961789 R65918 N3061 1 
A1979189 AI280889 AW273191 R66531 AI285845 AI675927 AW21990 AW190879 H37794 AA699667 H68427 AA954388 
AI188757 AI140048 AA430382 AI204151 AW247864 AA559099 AI431420 AA548276 A1149466 AA772669 AA694388 
AA724168 AA301651 AA281952 AA779925 AA234760 W86290 AA913603 AW511745 AI500697 AA814922 AA835040 
T47954 H53998 AA975804 R98710 AI077604 N70252 R98084 AW250171 H69268 A1597614 AA970746 AA972548 
AJ377116 R62962 H16737 RB9070 AA731329 R66532 N54354 AI818832 H81944 N71567 T95122 W86463 AA437095 
AM31999 AJ915724 N63851 AI674743 AA457307 AA21 1475 N64444 AJ799146 H72853 R99335 H60413 AA770367 
AA156105 AI269937 H64Q29 H89728 R65819 AW470496 A1873318 AI735713 H82987 C02447 AI478666 T27651 
A1699770 AW025156 H69719 A1984717 N69225 A1459856 AA953577 A1424691 H13843 R22404 AB73796 AI336002 
N70898 AI420854 AA541792 AA346142 AI000814 A1828348 AA045090 T51257 N90434 H13890 N73184 AI708083 
AA781606 AA329050 AA339985 R68964 H64795 W04186 H16845 

119416 genbank_T97186 T97186 

119559 NOT_FOUND_entrez_W38197 W38197 

123473 genbank_AA599143 AA599143 
TABLE 5: 

Pkey: Unique Eos probeset identifier number 

Accession: Accession number used for previous patent filings 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
UnigeneTitle: Unigene gene title 



Pkey Accession ExAccn UnigeneID UnigeneTitle 

113819 AA426573 AA486620 Hs.41135 AA486620 
132837 D58024 AA370362 Hs.57958 AA370362 
101545 M31210 BE246154 Hs.154210 BE246154 
102838 X06256 NM_002205 Hs.149609 NM_002205 
101192 L20859 BE247295 Hs.78452 BE247295 
102915 X07820 X07820 Hs.2258 X07820 
105330 AA234743 AW338625 Hs.22120 AW338625 
107385 U97519 NM_005397 Hs.16426 NM_005397 
102024 U03877 AA301867 Hs.76224 AA301867 
134416 M28882 X68264 Hs.211579 X68264 
103036 X54925 M13509 Hs.83169 M13509 
104865 AA045136 T79340 Hs.22575 T79340 
106124 AA423987 H93366 Hs.7567 H93366 
105330 AA234743 AW338625 Hs.22120 AW338625 
109001 AA156125 AI056548 Hs.72116 AI056548 
104764 AA025351 AI039243 Hs.278585 AI039243 
133200 AA432248 AB037715 Hs.183639 AB037715 
105263 AA227926 AW388633 Hs.6682 AW388633 
105178 AA187490 AA313825 Hs.21941 AA313825 
109456 AA232645 AW956580 Hs.42699 AW956580 
TABLE 5A 

Table 5A shows the accession numbers for those pkeys lacking unigeneID's for Table 5. The pkeys in Table 7 lacking unigeneID's are represented within 
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled 
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and 
Alignment Tools (DoubleTwist, Oakland California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession" 
column. 



Pkey: Unique Eos probeset identifier number 

CAT number: Gene cluster number 
Accession: Genbank accession numbers 



Pkey CAT Number Accession 

115819 10241_1 AA486620 AF205940 AA297524 AB034695 AA081335 NM_016242 AA188323 AA297537 H88204 AW953081 W31695 

AW582203 AA248250 AW68121 1 AA426230 AA4648D7 AA426155 N44141 AA347390 AA770661 AI333225 N36136 
AW665724 AA431894 AI374976 AM00254 AJ338446 AA1 86695 H88205 W04527 AA487056 AI051414 AA918383 
AA426573 AA425620 AW438654 AA090513 BE 167284 BE167291 AI301726 
102024 14505.1 AA301867 AW957981 R27614 AA155808 AI920990 AI740711 AA301026 AA301015 AI220981 AI857670 AI537140 

AW015210 AA030000 W46890 H44021 AI355967 AI651735 AA058479 AA146932 T58265 R85890AA047810 AA017387 
AW026093 AA971133 AI827263 AI056416 AI355994 AI127691 H46603 U03877 NIVL004105 AA157357 H42844 
AA146824 AA187709 AA187269 AA304348 AA147292 AA361687 AA15S041 AA330636 R32929 AA321130 AW950260 
AA082157AA029129 AA303708 AA028155 D31561 T84689 AA302493 BE 153057 BE153181 W39408AA1 87200 
BE 153250 AW383337 AW382622 AW382647 AW750072 BE 1 53060 AW3 82630 AW371865 AW392464 AW382664 
AW382658 AW382550 H61647 AW365075 AW365049 AA373397 BE072779 BE072781 230254 W24381 BE153254 
AA040442 BE072729 BE072731 N94740 AA146945 AW802737 AI826799 AI085395 R34034 H65140 AA082800 H88275 
AA147824 R63882 WB0899 AA296413 AI765300 AI862426 AW022055 AW300003 AI743784 AI862635 AI985428 
AA147764 AW573245 AW190290 AI040898 D57613 N63457 AA148082 AI028458 AA1481 10 AW814489 N75105 
AW629443 AA704122 AW582220 AA181240 AA057495 AI418224 AI261751 AW388595 A1472205 AW470672 AA102546 
AA789046 AA182416 AA062668 AW300732 AI288220 AA181982 AA146B25 AA028130 AI985522 AA303344 AA081313 
N69082 AA182035 AJ867128 AA100902 AA605087 N67178 AW020324 AW890446 AI472191 AI335691 AI597837 
AI081 143 AI335681 AA040443 All 28067 AI678244 AA018303 AA157260 W80792 AI934590 AJ096430 T54343 AI446350 
AA165196 AA78Q883 AA603631 AA047787 AA9685B0 AA912645 AW890504 AW026913 D56983 H52088 AA156121 
R30848 AW023036 AI590960 N67345 AI753225 AI753283 A1183768 AA147818 H89101 AI362141 H89205 AI14771 1 
AA321 129 AA6S8622 AA343479 AW069438 A1422376 AW629270 AA013413 AI221948 AA970605 M52335 H38366 
T91 160 AA657841 AA017386 AA152227 AA187593 AI913340 AI719313 A1969943 AT701271 A1004328 A1868348 N93659 
H65Q93 H25736 D57007 D56957 C00987 D61839 D56661 A 14721 37 AI971002 D56971 BE048830 D57972 AI589286 
AI361055 AI361071 A1292223 AA155898 D57139 D57981 D57345 AI420034 D57332 D57959 AA875933 R33493 N67558 
D58353 AA18B394 AA147966 AI160640 AI363165 H40638 AA578137 AW950265 AA300943 AI128999 H46584 AA917355 
N57820 AA320504 H51959 H25737 

101545 24607_1 BE246154 M31210 NM_001400 AA193392 NM_016537 AF233355 AF022137 H27787 AA370448 F05373 T27666 W21494 

AA036907 AI249966 N93476 F01623 AA304390 AA308808 

109456 180633_1 AW956580 AA886361 AI147670 AI090115 AI168683 AA232645 H99504 AA374707 AA380875 AW139567 AI735132 
BE439385 AW629780 N28322 AA232789 AA232790 N73285 

103036 17145_1 M13509 X54925 NM_002421 M16567 X05231 M15996 W39354 AA186634 AA852324 AA187507 AA081149 AA186524 

AA187264 AA187361 AA386155 AA186973 AA374217 U78045 AA081230 AA188049 AA1B6393 W56827 AA852602 
AA157468 AA308204 AA186754 AA186808 AA082516 AA304334 AW376428 BE439384 AW376420 AA156273 T 18504 
AA186521 W49496 AW084608 AA083575 AA372360 AW963590 AA132297 W47445 AA186376 AA157628 AW003999 
AJ037890 AI858060 AI589010 AI743739 AI452673 AW304188 AW117854 BE439933 AA157416 AW778966 AJ038497 
AA081005 AA10O829 AA181048 C02231 T27821 W23960 AW954802 A1471432 AW801296 AW801289 AW801 603 
AW801523 AW801292 AW801542 AW801601 AA181134 AI445147 AA191501 AA582862 N94407 AI147810 AA181880 
W49497 W52714 AA188249 AI932881 AI082493 AA503656 AA182682 AW801393 AA182830 AA181882 AA182826 
AI613182 N94510 W47343 AI085755 AI076956 AI918426 AA081208 AI282835 AA147528 AI081490 A1654536 AA181875 
AA081282 AA186389 C06085 AA083542 AI800644 AA157642 AA101069 AA157752 AA15B121 AA143331 AA081283 
AA852603 AA188296 A1932880 AW449628 AA187348 C02091 AA514656 AA082736 AA308766 AA143201 M16567 

133200 28950_1 AB037715 AI351347 AI375796 AI884765 AL121124 W01068 AI807275 T95240 R42807 AW515645 AI057314 AI033520 

AA057671 N70215 AA054215 AW204183 AA552149 T95130 AW796310 AI866520 AW275564 AW796308 AI637901 
AW197404 T78406 AA456232 AW20S463 AA779800 AI052696 AA026744 AA454623 AW470729 R45490 AW770258 
AI038393 AT290170 AA722734 AL121 125 R41608 AI862414 AA838611 R45582 AI278083 BE466849 BE219944 
AA418030 BE041555 AA578572 T16528 AW006344 Z39782 AI244848 A W1 37344 AA707400 AI032028 BE540464 
AI094265 A1184281 AA931890 AW382744 AW382729 AW020448 AW827237 AA431 226 A1672059 AW772345 N70172 
AW022003 AI862704 H19344 R61511 AI080204 H16566 AA432248 AI767980 T16688 AI984342 AI217478 AI767095 
238551 AI359566 AI361437 AI041000 R07033 H 16608 H19054 R12874 R61567 N98368 BE221 199 242320 AA094554 
R07078 AW860886 AA418090 R41262 

132837 256666_1 AA370362 AA364110 AW959554 AW371737 AW382068 AW604716 AW604713 AA487827 AW371674 AA429137 

BE503321 T93570 W72803 AI093076 AA487977 AT241562 BE439445 AW204065 R51635 A1802994 T10362 W68553 
AI866215 AW152154 AA700716 AI127443 R15824 A1537587 AA9531 10 D58024 AI52081 1 AA693570 A1453280 W76329 
AW023955AW022563 

102898 24023_1 NM_002205 X06256 M13918 BE070866 AW239485 AW996127 BE273894 BE272590 BE410252 R25975 T11786 T11787 

AA301 142 AA301165 AW960506 BE272819 AA386086 T39391 AA285303 AA370580 D58585 T5B668 AA156213 W24142 
AA343323 AW795067 AA1 51 1 97 AA3761 21 R94782 AA302363 HS0357 R82621 AA301677 H55997 AW796059 W92358 
AL046458 AA471 198 AA301952 R46287 R82594 H03186 AA187706 R32562 R27094 R25947 R25320 AW949809 H 13505 
H79049 R32403 H11213 R39710 H49765 H21142 H21006 AA417664 W52075 N56771 AA284240 N98556 N30907 

AA707335 AW603781 AJ340387 AI814584 AA524182 AA370076 AM 18785 AA704082 AB08851 H25513 T55388 
AA419827 H03986 H209S3 T56245 A1459715 AW973768 A1334096 A1693020 T63414 R82646 AW167251 H55938 
AG74916 AA778367 AI755253 AI033567 AW083222 AA181979 R26865 AA661627 AA706329 AI79S548 AA612799 
A1160180 AT274973 A1039264 AA301880 A1042429 AA307632 AI085688 A1278366 AI498890 AA303865 AI954844 
AA5Q2380 AA156334 AA723480 A1803S84 A1581026 AA304584 N51038 R947Q2 R69814 AW150962 AI570049 AA588807 

AA151 198 T53400 A1567709 AJ185326 AA309205 AW33B969 R53903 AA991891 AA301643 AI493337 AI026049 H25514 
AI741075 R28632 AW166445 AI333088 H49978 H91267 AA558193 AW079663 AA627380 AA807401 AI199956 AA86S118 
AI718216 AW193228 A1077745 AJ500496 AI266059 AW080383 R06468 R26757 R32404 AA716599 W92322 AI077734 
AI270181 R46198 AI217540 AA304045 AA305421 AW074445 AI468256 AW089568 AW571605 BE162930 H41009 
AA578313 AW874497 AA181284 AA861947 T29451 D20841 T58618 AA418731 A1282500 AW081407 AA604560 

AA729855 A1262538 A1580225 

102915 2903.2 X07820 NM_002425 BE271570 AI263526 AW296143 AJ829878 AI973162 AI085155 AA857496 AA709305 C02220 

134416 30894_1 X68264 NM_006500 AF089868 BE257461 BE275425 AW997154 AI902799 AB02803 M78206 AA085891 AW392972 

AA325490 BE006161 AA3492S9 AA323568 AL042548 AA191148 AA187703 AA322791 AJ297452 T11825 AW365487 

AA303513 AA186961 AA173480 N28330 N28379 W40320 AA187118 H03695 AA402709 BE407476 H06354 BE276589 

AA351284 AA379921 AL138060 BE410587 AA113094 AA340481 BE277483 R21191 R79518 NB6170 AA320505 
AA296065 AW951900 AA658897 AA650052 AA654304 AA191691 N26649 AW080963 AI265800 N72019 A1453458 
AA092563 AA402310 AI439450 AI061054 AA302358 T71565 AA3Q2047 AA303432 N21289 H27357 AA303504 AI174583 
AW151762 AA181958 AW880618 AA630773 AI889539 AW901058 AI373405 AA341941 AA086217 A1575590 AI653936 

AA633570 AA987619 AI270656 N93847 N40689 AW517517 N20030 W95985 AA303955 H89170 AA309917 N21642 

AA373132 W38517 AIS87806 W76182 AA101065 AA036916 N45635 AI744510 A1669803 A1039157 AI126355 AA634607 
AW131120 AW196838 AA190601 AA911130 BE221320 N92355 AA036752 H03696 AA5B8873 A1458868 AI041818 
AA090477 AI093248 AA304755 AL137942 AL044688 AI083709 AI150985 N88891 AA635675 AA594898 W94657 
AA182823 AW166205 F27886 R79246 F37329 AA565697 AI075739 AI088654 AI094287 A1204256 AA095203 T93020 

AA68838 AA057324 N23442 AA075411 AA305046 AI031688 A1191503 AA111887 AA112264 N27929 AA187509 

A1375522 AW74006 H06297 AI826177 N48880 H28333 AA075490 R22809 W79542 AI055934 AA042901 AA173481 
AA301986 W74531 AI051747 AA187715 AI888888 AA993017 A1057530 T92954 N80227 AW273595AI351 260 AW1 70643 
AW292979 AA302605 AA302330 BE349495 AA328602 AA302361 AI470984 AA155943 AA155914 
105178 7792_1 AA313825 AW960347 AF223468 NM_016613 AA185345 AA186508 AA081195 AA147972 AA346943 AW961667 

AA1B7222 AA187207 AW371052 AW449751 AW748803 AW391606 AW371047 AW371057 AW371085 AW362895 

AW371092 AW377556 BE010930 AI016882 AA247878 C04398 C05158 F11398 AA188315 H23385 R55086 H15346 
AA029106 AA228114 H17005 F08498 Z43376 AA095582 AA055186 AA463361 R15218 AA299132 AW103578 W21538 
AA428131 AA187115 AA157197 AA157167 AW371371 AA363562 AW965995 N55663 Z17878 AA228023 AI140342 
AA100927 AA496988 AA055917 AI089303 AW014967 AW090248 AW338371 AW131 066 D62963 D79713 AI583950 

A1336781 AI500705 AI471485 AW090239 D79784 D61847 D62789 D61842 AI086327 AI273381 D61815 D63043 AI913548 

A1280560 AI510828 AA029996 C16343 C16513 AI075741 AW516308 AI804764 AA948068 A1356588 AW103452 
AW573063 Z39445 C16489 AI949870 F04712 AA147823 AW026284 AI151538 AA081303 AA613890 AI251865 AW086499 
AA992111 AI862091 AI373465 BE5Q2094 AI922270 AA884288 AA157079 N56963 AW189145 AA428080 R55056 
AA884068 AW771716 AA186562 C16364 H15723 AI921181 AA15688B H17006 AA187490 A1400994 AA346942 H28533 

AW129047 R41656 H14636 AA995041 D58370 221 131 D58186 A1383271 AA643977 D58044 A1934302 AW779425 

F09065 H14930 AA890693 H23274 

105263 178672_2 AW388633 AW378440 AW388283 AW388339 AW388333 AW388414 AW388413 AW388607 AW388453 AW388587 
AW388480 AW388591 AW388711 AW388511 AW3B8438 AW388570 AW388449 AI694383 AW237145 AI652991 
AI964041 AW366319 AW366321 AW961938 AW46921 1 AI634155 AM92186 AI624430 AI677965 N26502 AI963871 
AW378431 AW378421 AI015391 AW352126 N59336 A1352317 AW197113 N67998 AW778935 AW76054 AI206626 

R371 16 R4021 1 AA227926 AA639698 R38073 A1001745 T32854 AI619649 AM23703 F10774 AW388615 T16595 H05894 

105330 182497.1 AW338625 R43226 R51640 AI307645 AI308100 AI085787 AI420357 AI692610 AA877160 AI953366 AA234743 

104754 90967_1 AI039243 R68234 AA025351 AA971063 AI537757 AA025382 R81636 T86650 

104865 102037.1 T79340 AI742317 AW182676 AW451460 AM20964 R43284 AA088179 AW590886 AVSQ69529 AA045187 AJ521736 

A1827455 AA045136 AW271709 AW04344 AA639631 AA744417 AA744218 AA045357 AA045351 

106124 54542_1 H93366 AI653547 AA336265 AW966175 BE566451 R71178 AI630556 AA234331 N55039 AA305632 AW960431 R34044 

R32254 AW020970 AW451281 AW275041 A1636933 AI655640 AA423986 AA642466 AI684063 AI633876 AI624897 
AA814795 AW590328 AI889166 AW243541 AI439691 AW473445 AW75516 AA741228 AI127534 AA165143 AI074714
AI654076 AA400574 AI560249 N50709 AW438621 AI808810 A1434579 AJ308184 AA423987 A1141272 AI565586

AI33B440 AA219628 AI246643 AS85809 AA724260 AA633988 AI364172 AI798439 AB50801 R33503 AW35891

AA903649 T96161 AA665538 AA219620 AI309962 AA400707 BE247056 R32178 A1275962 AA661602 AW003197
BE466649 AA831198 AI620052 AI825387 AI634037 AI670978 AI670979 AI655092 R32304 AA828858 AI382428
AW023660 AA262892 T26891 AW089917 T26926 R32227 
107385 6976.1 NM_005397 U97519 AWB99329 A1902387 AA077792 AA078525 AW376607 AA077946 AA070415 BE208721 AW167958

BE293050 BE208240 AI648698 AA101314 BE393348 BE305122 AA077591 BE274036 AA313687 BE392220 BE378954

AA171461 AA464821 AW938242 AW938224 AW938243 AW938232 AA147953 N54294 AA205218 AW305065 AW517478 
AA307983 AA377023 BE563629 R99976 N80294 T87719 T87928 AA496849 AA486344 AA204938 AW370448 AA318242 
AW964384 H92423 W95317 BE378774 BE391156 AA349138 AA173095 AW513198 AA037672 AA148029 AA169726
W04791 AA075508 BE382937 BE395034 AF139793 AA961734 N48612 H64714 AW151251 AI565113 AI566881

AW087370 AA631168 AA622014 AW513098 AI857810 AW152287 A1052596 AI983246 AA024856 A1912456 AI677938

AW026403 AA972537 AI088497 AW9998S9 W94582 AH 401 66 AI160659 AB66868 AA101263 AW190390 AW166466 
A1401207 AI418156 AI625265 AI146298 AW008592 BE223020 N58926 AJ308797 AA037673 AI935992 AI304706 
AA024939 AI216589 AI610423 Ai354621 AI500677 AI679389 AI799310 N64508 AJ128756 AI67^97 AW589535 
AA989333 AI500527 AA565479 AA913529 AI923295 F21691 AA989376 A1699064 AA902447 AI690910 AA7726S 

AA204983 A1337895 R99975 H652Q5 AA340766 AI339441 AI913855 AA450293 AW192010 AA070416 N72401 AI371481

A1247108 AI371261 AI364987 AI280171 AI269104 AI868756 AA909836 AA983640 AS73271 AA913092 AI868205 
A1144112 A1190975 N58085 AB66S38 N93405 AW150504 AW296846 A1687036 AA902984 AI824460 A1625047 AA653148
AI611228 AW131922 AA852687 AA902519 C01732 AW796045 AL044660
101192 15367.1 BE247295AW068092AW1313M159244rfl^005415L20B59AL135570W470 BE408629 

W46972 BE293646 BE256647 AI075010 AL041095 AA2853Q0 AL039560 AA368740 W26602 AA399344 AA039235

W27631 AW834898 AW834914 R93390 AA378039 AV649660 T53674 N98824 AA399974 AW843378 AA368267 R08256 
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AV653575 RZ79Q0 N46215 AW366371 N45500 AV6529S7 AI889251 AI080457 N39021 A1738542 AW242849 A1857471 
A1859775 A1582830 R75850 N56564 AW341638 AI499006 AI887217 AW026894 AW182840 AA033313 AA831346
A1393465 AW0S9210 A/743830 AA744243 AA401310 AW439758 AW088152 R93391 AA291379 AA225220 AW009358 
AI192879 AA291202 AI565089 AA225089 AAB07688 A1052058 AB41641 AI066625 AA333884 AA159147 A1923912 
R75851 AI761143 AW768588 AA394195 A1288450 AW512564 A1452775 AI056520 AA468602 AA872566 AK34739

AA291838 AI948623 AW768614 AI374753 AW068174 AA884908 AI199346 AI199347 W94946 AI159995 AA877642
AI280646 AB07610 AA403310 R08205 AW182123 AI000999 R27808 AW026571 D20816 A1560350 T27667 AW960271 
A1174628 AI432042 AM24528 AA909562 T17342 AI783866 
109001 146370J A1056548 AW409843 AW263540 AA723669 AA909334 AA156120 AA157141 AA156125 AW409866 W19499 AA157229
AW887435
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TABLE 6: 



Pkey: Unique Eos probeset identifier number
ExAccn: Exemplar Accession number, Genbank accession number
UnigeneID: Unigene number
Unigene Title: Unigene gene title
AUC1: 70th percentile of average intensity (AI) for probeset at each of 2, 6, 15, 24, 48, and 96 hour timepoints minus 70th percentile AI at 0 hrs, summed over 5 experiments.

AUC2: AUC1/[illegible] percentile of AI for aorta, aortic valve, vein, and artery.
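
The AUC1 and AUC2 definitions above can be read as a short computation. The following Python sketch is illustrative only and rests on assumptions not stated in this table: each experiment is assumed to supply replicate AI readings per timepoint, the percentile is computed with linear interpolation, and the percentile applied to the vessel tissues for AUC2 is passed in as a parameter. All function names are hypothetical.

```python
from typing import Mapping, Sequence

# Timepoints (hours) named in the AUC1 definition; 0 h is the baseline.
TIMEPOINTS_H = (2, 6, 15, 24, 48, 96)


def percentile(values: Sequence[float], q: float) -> float:
    """q-th percentile (0-100) with linear interpolation between ranks."""
    xs = sorted(values)
    if not xs:
        raise ValueError("percentile of an empty sequence")
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def auc1(experiments: Sequence[Mapping[int, Sequence[float]]], q: float = 70.0) -> float:
    """Sum, over experiments and timepoints, of the q-th percentile AI
    at each timepoint minus the q-th percentile AI at 0 h.
    Assumed layout: each experiment maps hour -> replicate AI readings."""
    total = 0.0
    for exp in experiments:
        baseline = percentile(exp[0], q)
        total += sum(percentile(exp[t], q) - baseline for t in TIMEPOINTS_H)
    return total


def auc2(auc1_value: float, vessel_ai: Sequence[float], q: float) -> float:
    """AUC1 divided by the q-th percentile of AI across the vessel tissues
    (aorta, aortic valve, vein, artery). q is an assumed parameter."""
    return auc1_value / percentile(vessel_ai, q)
```

For example, five identical experiments with a 0 h reading of 10 and a reading of 20 at every later timepoint give an AUC1 of 5 x 6 x 10 = 300.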



Pkey ExAccn UnigeneID UnigeneTitle AUC1 AUC2



319962
338033 
314943 
332640 
338158 

327036



314941 
327414 
321911 
331578 
332466 
313513 
320635 
326230 
313556 
313665 
324852 
314372 
311877 
322262 
312173 
319795 
313350 
326759 
300318 
313978 
306840 
310272 
315044 
321325 
303251 
302378 
315060 
332048 
337214 
311598 
304782 
312802 
302680 
317452 
318558 
312149 
319267 
321510 
326198 
315730 
310442 
331237 
300469 
338316 
330968 
331019 
331261 
301822 
325544 
328700 
322882 
336034 
316580 
309931 



AA515902 Hs.130650 ESTs 1038 

predicted exon 303.2 

AF025944 Hs293797 ESTs 429.2 

AI246482 Hs249989 ESTs 677.4 

AB018259 Hs.118140 KIAA0716 gene product 395.2

AW298600 Hs.141840 ESTs, Weakly similar to S59501 interfero 324 

N50617 Hs.60506 small nuclear ribonucleoprotein polypept 394.8

predicted exon 3572 

Hs.118502 433.6 

Hs.120932 ESTs -83 

Hs.135104 ESTs 3482 

Hs.142003 ESTs, Weakly similar to The KIAA0149 gen -492

Hs.85339 G protein-coupled receptor 39 -1309 

Hs.188746 ESTs -247.8 

Hs3Q4471 ESTs, Highly similar to AF116865 1 hedge -10253

Hs.146858 protocadherin 10 203.6 

Hs.57958 ETT. protein 1B3.8 

predicted exon 1654.4 

Hs256982 ESTs, Highly similar to AF116865 1 hedge -346

Hs.13957 ESTs 576.6 

Hs.307912 EST 56.4 

Hs.148932 semaphorinRs, short form -127.6 

Hs204169 ESTs -102.6 

Hs.300646 KIAA protein (similar to mouse paladin) 1080.6

Hs.115897 protocadherin 12 1270.8

Hs.296506 Homo sapiens mRNA full length insert cDN 915.8 

Hs.189048 ESTs ( Mc<teraterys» 1236.8 

Hs201591 ESTs 522.6 



AA628517 
AW751201 
AI380792 
AL040178 
AA084248 
AA632012 
A1821409 
AB037821 
AW591949 

AW444502 
AI870175 
AI077477 
AF216389 
BE547674 
AB033100 
AF240635 
AL109712 
AA551104 
AW337575 

AW023595 

AA582081 

AA644669 

AW192334 

AA972965 

AW402677 

T90309 

F11802 

H75391 



predicted exon 269 
Hs.232048 ESTs 796.4 
gb:nn32h08.s1 NCi_CGAPJ3as1 Homo sapiens 



Hs.193042 
Hs.38218 
Hs.135568 
Hs.146381 
Hs269651 
Hs.6818 
Hs255748 



H25899 Hs.201591 
AW072215 Hs208470 
W87874 Hs.25277 
BE301708 Hs.233955 

R44557 Hs.23748 
NM.006033HS.65370 
BE539976 Hs.103305 
X17033 Hs271986 



AW248508 Hs.279727 

AA938198 Hs.146123 
AW341683 
R39288 Hs.6702 
H06350 Hs.135056 

Y00272 Hs.184572 
BE568452 Hs3101 



ESTs 349.6 

ESTs 638.6 

ESTs 360.8 

RNA binding motif protein, X chromosome 7002 

ESTs 2742 

ESTs 238.2 

ESTs 231.8 

predicted exon 581.6 

ESTs 281.6 

ESTs -213 

hypothetical protein RJ210S5 285 

hypothetical protein FLJ20401 26.6

predicted exon 14942 

ESTs 9753 

lipase, endothelial 2012 
Homo sapiens mRNA; cDNA DKFZp434B0425 (f

integrin, alpha 2 (CD49B, alpha 2 subuni 3562 

predicted exon 1014.6 

predicted exon 627.4 

Homo sapiens cDNA FLJ14035 fis, clone HE 843

predicted exon 782.6 

hypothetical protein FLJ12972 746.4
gbiid13d0lJi1 Soares J4FLJTJ»C_S1 Homo s 

ESTs 137 
Human DNA sequence from clone RP5-850E9

predicted exon 5403 

cell division cycle 2, G1 to S and G2 to 4943

protein regulator of cytokinesis 1 -600 

predicted exon 3112 

predicted exon 3513 



9 

303 

424 

103 

393 

324 

393 

35.7 

12 

03 

343 

0.5 

0.2 

1 

1 

52 
18.4 
12 
1 

2.3 
0.4 
0 
0 

43 

5.3 

153 

4.9 

4.7 

26.9 

202 

316.4103 

73 

63.9 

36.1 

6.6 

7.5 

233 

232 

8.2 

9.7 

03 

0.5 

0.3 

34.7 

13 

05 

478.613 

1.7 

9.4 

62.7 

5.7 

783 

133 

1343133 
13.7 

143 03 
14 
1 
1 

31.1 
352 
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302655 AJ227892 Hs.146274 ESTs 1802 18 

327 55B predicted exon 229 223 

324801 AW77Q553 Hs.14553 sterol O-acyltransferase (acyl-Coenzyme 1615 16.1

317850 AI681545 Hs.152982 hypothetical protein FLJ13117 -690 1

322818 AW043782 Hs293616 ESTs 126.4 4.5

324626 A1685464 Hs292638 ESTs 1702 17 

317224 X73608 Hs.93029 sparc/osteonectin, cwcv and kazal-like d -60 0

310955 AI476732 Hs263912 ESTs 466.8 46.7

315240 R38772 Hs.172619 KIAA1106 protein 277 27.7

338388 predicted exon 267.6 26.8

338442 predicted exon 256 25.6 

318617 AW247252 Hs.75514 nucleoside phosphorylase 1247.8 24.2 

338645 predicted exon 206 20.6 

313135 N58907 Hs.162430 ESTs 204.8 20.5 

324716 BE169746 Hs.12504 hypothetical protein DKFZp761D081 203.6 20.4

330305 predicted exon 199.8 20 

308248 AI560919 gb1q41g10.x1 NCI_CGAPJJt1 Homo sapiens 199.4 195 

308886 AI833240 gbat76d10jt1 Barstead colon HPLRB7 Homo 1982 19.8 

315622 AI795144 Hs258188 Homo sapiens cDNA FLJ11674 fis, clone HE 1912 19.1

323675 R43240 Hs272168 tumor differentially expressed 1 1892 16.9

312164 T91980 Hs221074 ESTs 187.6 163

300378 Z45270 Hs235673 hypothetical protein FLJ22672 271.6 18.7

317478 AI343569 Hs.107000 Homo sapiens mRNA for WDC146, complete c 187 18.7 

317559 AW452344 Hs.129977 ESTs 1842 18.4 

317207 AI873346 Hs214505 ESTs 1823 183

334834 predicted exon 1783 17.9 

320925 D62892 gb:HUM337CQ7B Clontech human aorta polyA 1772 17.7

303289 AL121460 Hs.272673 hypothetical protein FLJ20508 316.4 17.6

328548 predicted exon 174.6 173

317108 AA884000 Hs3173 hypothetical protein FLJ10803 172.4 172

318013 A1188183 Hs.144078 ESTs 326 17.2 

314299 AW382682 Hs.154840 ESTs 1703 17.1 

317702 AW173339 Hs.135665 ESTs 1693 17 

316094 AW975920 Hs283361 ESTs 169.4 16.9 

323706 AA377578 Hs.65234 hypothetical protein FLJ20596 1692 163

325843 predicted exon 321.4 16.9 

316012 AA764950 Hs.119898 ESTs 10472 16.9 

309687 AW236154 Hs.77385 myosin, light polypeptide 6, alkali, smooth 168.2 163

323329 AL134744 Hs.10852 ESTs 168 163

312853 W05086 Hs.114256 ESTs 167.4 16.7

313070 AI422023 Hs.161338 ESTs 2983 163 

314096 AW977642 Hs.291742 ESTs 165.6 163 

338728 predicted exon 165.4 163 

316609 AW292520 Hs.122082 ESTs 165 163 

305989 AA888220 gb:oj15h01.s1 NCI_CGAP_Kid5 Homo sapiens 164.6 16.5

312642 AW052128 gbwx26c02jc1 NCI_CGAP_KW1 1 Homosaplen 164 16.4 

339236 predicted exon 163.6 16.4 

317058 AI217713 Hs.147586 ESTs 1613 162

311137 AW207582 Hs.196042 ESTs 582.2 162

310178 AI936450 Hs.147482 ESTs 161.2 16.1

32)745 K51696 Hs.89278 hypothetical protein FLJ11186 161 16.1

317336 AW014637 Hs.130212 ESTs 160 16 

309871 AW300366 gbos63b05j(1 NCLCGAP_Wd11 Homosapfen 159.816 

302038 AC004076 Hs.129709 Homo sapiens chromosome 19, cosmid R3021 159 153

332237 N52883 Hs.102676 EST 159 153

312362 AW015994 gb:Um^l0r>abrvg-()WWJU1 NCLCGAP.S 158.6 153 

331558 N62401 Hs.48531 EST 158.6 153 

316215 AI684535 Hs.200811 ESTs 158.4 153 

336059 predicted exon 157.4 15.7 

302790 AJ245245 gb:Homo sapiens mRNA for immunoglobulin 1553 15.6

328418 predicted exon 1533 15.4

304229 AK000149 Hs29493 hypothetical protein FLJ20142 153.6 15.4

331606 AW273285 Hs.50802 ESTs 153 153 

338962 predicted exon 664.4 153 

317959 AI204202 Hs.130264 ESTs 152.6 153

336228 predicted exon 152.4 152

313534 AW072916 Hs.78743 zinc finger protein 131 (clone pHZ-10) 1522 152

317404 AI806867 Hs.126594 ESTs 1522 152

311943 AI469911 Hs26498 hypothetical protein FLJ21657 152 152

314680 AI247425 Hs.152182 ESTs 151.4 15.1

331484 N29696 Hs.44076 EST 1512 15.1 

338116 predicted exon 1512 15.1 

38863 predicted exon 150.6 15.1 

315555 AW452886 Hs239107 ESTs 149.6 15 

317039 AA868583 Hs.126153 ESTs 149.6 15

331138 R63816 Hs28445 ESTs 149.6 15 
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316561 A1917222 Hs.121655 ESTs 149.4 143 

328695 predicted exon 149.2 143 

302282 BE396283 Hs.173987 eukaryotic translation initiation factor 148.4 14.8

318781 F11802 Hs.6818 ESTs 148.2 14.8

323709 AW297246 Hs.288546 Homo sapiens cDNA FLJ14190 fis, clone NT 148 14.8

310790 AW192053 Hs248865 ESTs 147.8 14.8 

316833 AW292614 Hs.124367 ESTs 147.8 14.8 

323176 NM_007350 Hs.82101 pleckstrin homology-like domain, family 229 14.8

324188 AW274439 Hs252709 ESTs 147.6 14.8

317441 AA922798 Hs.196583 ESTs 147.4 14.7

317584 A1825890 Hs.220513 ESTs 146.8 14.7 

321798 A1308206 Hs.181959 ESTs 146.8 14.7 

304363 AA206045 gb:zq77f05.s1 Stratagene hNT neuron (937 146.6 14.7

313952 F20956 gb:HSPD05390 HM3 Homo sapiens cDNA clone 146.6 14.7

301909 AJ702609 Hs.15713 ESTs 263.8 14.7

309196 AI904895 Hs.9614 nucleophosmin (nucleolar phosphoprotein 1462 14.6

321860 N47474 Hs212631 ESTs 1462 14.6 

330187 predicted exon 146 14.6 

323042 AA463571 Hs.172550 polypyrimidine tract binding protein (he 145.6 14.6

313638 AA262397 Hs201366 ESTs 145.2 145

302437 AB024729 Hs227473 UDP^-acelylgjucosamtoe:^ 145 145 

318197 AJ473096 Hs.133403 ESTs 144.8 145 

302749 M16951 gb:Human Ig mu-chain mRNA VDJ4-region 1 5 144.6 145

322357 AI734258 Hs245367 ESTs, Weakly similar to ALU1_HUMAN ALU S 144.6 14.5

300391 AI927371 Hs.288839 hypothetical protein FLJ12178 144.4 14.4

326077 predicted exon 144.4 14.4 

302004 Y18264 Hs. 123094 sal (Drost>phfeHike 1 ' 144* 144 

320668 AA805666 Hs.146217 Homo sapiens cDNA: FLJ23077 fis, clone L 144 14.4

331212 TB8693 Hs226410 ESTs 144 14.4

311268 AJ969727 Hs231859 ESTs 1432 145

305159 AA659166 Hs275668 EST, Weakly similar to EF1D_HUMAN ELONGATION F 143 145

304510 AA457391 Hs.119122 ribosomal protein L13a 142.8 145

320852 AA772920 Hs.303527 ESTs 14Z8 145 

330854 AW291944 Hs.122139 ESTs 14Z8 145 

318275 AW449952 Hs.190125 basic-helix-loop-helix-PAS protein 14Z6 145

314992 AI824879 Hs.211286 ESTs, Weakly similar to 1207289A reverse 1422 142

322631 AA001697 Hs.293565 ESTs, Weakly similar to putative p150 [H 1422 142

332283 R40855 Hs.100839 EST 142 142 

302894 AA719572 Hs.274441 Homo sapiens mRNA; cDNA DKFZp434N011 (fr 1412 14.1 

301808 R35391 Hs.252831 reticulon 3 141 14.1

318608 AI204491 Hs.151502 ESTs 141 14.1 

316499 AW292947 Hs.122872 ESTs 140.8 14.1 

317011 AI248760 Hs.150276 ESTs 140.8 14.1 

321840 N45600 Hs.46534 Homo sapiens mRNA; cDNA DKFZp434P0714 (f 140.8 14.1

327365 predicted exon 1405 14.1

331264 AA278898 Hs.225979 hypothetical protein similar to small G 140.8 14.1

324545 AW501944 Hs.127243 Homo sapiens mRNA for KIAA1724 protein, 140.4 14

312986 AA211586 gb:zn56d05.s1 Stratagene muscle 937209 H 1402 14

316053 AA825814 Hs.149065 ESTs 1402 14

330723 BE247449 Hs.31082 hypothetical protein FLJ10525 1402 14

304876 AA595765 gb^J28g06^1 NCl_CGAP_M1 Homo sapiens 1395 14 

311379 AW134766 Hs.202450 ESTs 1395 14 

318265 AW019873 Hs.146840 ESTs 1395 14 

324137 AA393127 Hs222762 ESTs 139.8 14 

328262 predicted exon 139.6 14

322349 AK001279 Hs.180171 Homo sapiens cDNA FLJ10417 fis, clone NT 139.4 135

323504 AA280223 Hs.130865 ESTs 139.4 135

304261 AA059387 gb:zf86d01.s1 Soares retina N2b4HR Homo 1392 13.9

310489 AW451493 Hs235516 hypothetical protein PRO2955 1392 13.9

335946 predicted exon 1392 13.9

318155 AI041546 Hs.132133 ESTs 138.8 135 

313796 AI797169 Hs208486 ESTs 1385 135 

333977 predicted exon 138.6 135 

324845 AW969635 Hs283718 ESTs 1382 13.8 

331139 R65706 gb:yi16g12.s1 Soares placenta Nb2HP Homo 1382 135

331131 R54797 gb:yg87b07.s1 Soares infant brain 1NIB H 6695 135 

321250 H58539 Hs.151692 ESTs 138 135 

312498 AA668782 Hs.191284 ESTs, Weakly similar to ALU1_HUMAN ALU S 1375 135

331252 W52470 Hs54578 alpha2,3-sialyltransferase 1375 135

337407 predicted exon 1375 13.8

303973 AW512014 gb»68a03jc1 NCI_CGAP_Lym12 Homo sapien 137.4 13.7

314582 AA412258 Hs.188817 ESTs 137.4 13.7 

327373 predicted exon 1372 13.7 

323367 AA234591 Hs504123 ESTs 136.6 13.7 

3162)7 AA832065 Hs.120260 ESTs 136.4 13.6

315231 AA705809 Hs.119922 ESTs 1362 13.6 



318592 T39310 Hs.1139 cold shock domain protein A 136.2 13.6
320905 AW969706 HsJB3332 ESTs 136.2 13.6
328937 predicted exon 1365 13.6
329073 predicted exon 136.2 13.6
318231 AV659082 Hs.134228 ESTs 136 13.6
311992 AL360200 Hs.114145 ESTs 135.8 13.6
316497 AA766457 Hs.136849 ESTs 1353 133
317677 AA968594 Hs.127868 ESTs 135.8 133
321680 W02848 Hs.93704 ESTs 135.8 13.6
326080 predicted exon 135.8 13.6
330938 AF036943 Hs.172619 KIAA1106 protein 1353 13.6
306573 AL134878 Hs.119500 ribosomal protein, large P2 135.6 13.6
307383 AI223207 Hs.147888 EST 135.6 13.6
311114 AW449382 Hs.195297 ESTs 135.6 133
320579 R15138 Hs.165570 Homo sapiens clone 25052 mRNA sequence 135 133
301328 AA884104 Hs.125546 ESTs 134.8 135
312063 N56198 Hs.162898 ESTs 134.8 133
323036 H09604 Hs.13268 ESTs 134.6 133
332776 AF241850 Hs.151428 ret finger protein 2 134.4 13.4
332494 AA282330 Hs.145668 ESTs 134.2 13.4
334376 predicted exon 134.2 134
313264 N93416 Hs.118228 ESTs 133.6 13.4
313669 AA351109 Hs.5437 Tax1 (human T-cell leukemia virus type I 133.2 13.3
312083 T87398 Hs.205816 ESTs 132.6 133
319354 AA993807 Hs.167367 ESTs 132.6 133
307414 AI242106 gb:qh92a02.x1 Soares_NFL_T_GBC_S1 Homo s 1325 135


312771 AA018515 Hs.264482 Apg12 (autophagy 12, S. cerevisiae)-like 131.8 135
313004 AI274963 Hs.145900 ESTs 131.2 13.1
300995 AW510641 Hs.258018 ESTs 220.6 13
319323 F12650 Hs.13287 ESTs 125.4 123
329451 predicted exon 123.4 123
337603 predicted exon 572 125
312480 R68651 Hs.144997 ESTs 121.4 12.1
324934 AW452051 Hs.147546 ESTs 119.4 113
320723 BE178025 Hs.7942 hypothetical protein FLJ20080 117 11.7
318188 AI792566 gb:qI74fD5.y5 NCI_CGAP_Ov26 Homo sapiens 116.6 11.7
320873 AF238869 Hs583955 Homo sapiens clone GLSH-2 similar to gfi 112.8 113
331005 BE003191 Hs.119555 ESTs 112.6 113
304969 AA614405 gb:np46(05.s1 NCI_CGAP_Br11 Homo sapiens 1154 115
319799 A1139253 Hs.227767 zinc finger protein 41 1115 11.1
302610 AA347945 Hs556024 ESTs 111 11.1
309485 AW130320 Hs.108124 ribosomal protein S4, X-linked 111 11.1
311880 AW419225 Hs.256247 ESTs 1105 11
313981 AW452334 Hs.128148 ESTs 1105 11
322442 W49701 Hs59667 ESTs 109.4 10.9
315099 AA806536 Hs.291841 ESTs 109 103
304793 AA583264 Hs.182979 ribosomal protein L12 108.8 103
330815 AA019211 Hs.235463 KIAA1238 protein 108.8 103
304044 T81656 Hs.252259 ribosomal protein S3 7143 103
predicted exon 135 10.8
325889 predicted exon 814.6 103
321447 AW891130 Hs.38173 ESTs 107.8 103


302990 AA496212 Hs.180182 ESTs 1065 10.6
308106 AI476803 gb±77e12j(1 Soares_NSF_F8_9W_OT_PA_P_S 270.6 10.6
310536 A1301041 Hs.150174 ESTs 106 10.6
315257 AW157431 Hs.248941 ESTs 233 10.6
318787 Z42313 Hs52657 ESTs 105.8 10.6
312306 AI927226 Hs.175610 ESTs 1055 103
325788 predicted exon 104.4 10.4
312234 AA830840 Hs.206934 ESTs 104 10.4
314482 AW085525 Hs.134182 ESTs 234 10.4
323597 AI185693 Hs.135119 ESTs 102.4 105
302623 AW836724 Hs.194110 hypothetical protein PRO2730 162.4 105
323594 AI791531 Hs.129993 ESTs 101 10.1
324315 N55761 Hs.194718 zinc finger protein 265 1005 10
314217 AA256465 Hs.188725 ESTs 995 93
320932 AA554913 Hs.162297 ESTs 985 93
327876 predicted exon 98.2 93
319736 R17424 Hs.6550 vacuolar protein sorting 45B (yeast homo 98 93
327747 predicted exon 97.6 9.6
327844 predicted exon 97.4 9.7
318200 AI061192 Hs.166517 ESTs 975 9.7
329414 predicted exon 975 9.7
318296 AI089667 Hs570713 ESTs 121.4 9.7
307010 AI140014 gbroa68©9jc1 Soares_fetal_heart_NbHH19W 295 9.7
319792 AI138635 Hs52968 ESTs 385.4 9.6
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305671 AA811588 Hs.82113 dUTPpyrophosphatase 38 9.6 

323440 predicted exon 93.8 9.4 

310381 A1263059 Hs.145594 ESTs 93.4 9.3 

318824 F05771 Hs.27226 ESTs 93.4 9.3 

328957 predicted exon 922 92

318804 Z42549 Hs.160893 ESTs 92 92 

330838 AA055611 Hs226566 ESTs, Moderately similar to ALU4_HUMAN A 92 92

324592 AW752437 Hs.325708 ESTs 91.8 92 

311820 AW274545 Hs.254333 ESTs 91.4 9.1 

321614 H86161 gb:ys94b01.r1 Soares retina N2b5HR Homo 91 9.1

330306 predicted exon 91 9.1 

303096 AL080276 Hs.268562 regulator of G-protein signalling 17 90 9 

313275 A1027604 Hs.159650 ESTs 110.4 8.8 

302593 H54855 Hs.36958 ESTs 88 8.8 

321421 BE465115 Hs.171688 ESTs 862 8.6

330832 AJ133530 Hs.62930 ESTs 456.4 8.6 

311847 AW301807 Hs.297260 ESTs 86 8.6 

322036 BE002723 Hs.301905 Homo sapiens cDNA FLJ14080 fis, clone HE 145.8 8.6

328688 predicted exon 85.6 8.6

325251 predicted exon 85.4 8.5

329088 predicted exon 85.4 8.5 

322524 W79027 Hs.271762 ESTs 84 8.4 

337953 predicted exon 451 8.3 

323529 AA284397 Hs201485 Homo sapiens clone FLC0664 PRO2866 mRNA, 82.6 8.3

307041 A1144243 gb:qb85b12.x1 Soares_fetal_heart_NbHH19W 306.8 82

318285 A1332454 Hs.158412 ESTs 81.4 8.1 

312021 AA759263 Hs.14041 ESTs 81 8.1 

329350 predicted exon 81 8.1 

326169 predicted exon 80.4 8 

338038 predicted exon 10242 7.9

312549 AI214510 Hs.146304 ESTs 77.4 7.7

312542 D60076 gb:HUM084E1QA Clontech human fetal brain 765 7.7

320992 AB026891 Hs.225972 solute carrier family 7, (cationic amino 76 7.6

318596 AI470235 Hs.172698 EST 150.6 7.5 

315650 AA649042 Hs269615 ESTs 73.4 7.3

324328 AA447276 Hs292020 ESTs 2104 7.1

332622 R10674 Hs.128856 CSR1 protein 702 7

328229 predicted exon 69.4 6.9

319110 T75260 Hs.98321 hypothetical protein FLJ14103 68.6 65

316133 A1187742 Hs.125562 ESTs 308.6 6.9

303992 AW515800 gb:hrJ88g01.x1 NCI_CGAP_GC6 Homo sapiens 675 6.8

322675 AA017656 Hs.146580 enolase 2, (gamma, neuronal) 3772 6.7

325753 predicted exon 1052 6.6 

312539 AKKM377 Hs200360 Homo sapiens cDNA FLJ13027 fis, clone NT 922 6.4

302592 AA294921 Hs250811 v-ral simian leukemia viral oncogene hom 361.6 6.3

314578 AA410183 Hs.137475 ESTs 201.6 6.1 

335986 predicted exon 108.6 6 

321478 AW402593 Hs.123253 hypothetical protein FLJ22009 528 6

305192 AA666019 gb:ag44a04.s1 Jia bone marrow stroma Hom 58.6 5.9

304275 AA070605 gb:zm53h09.s1 Stratagene fibroblast (937 78.6 5.6

302779 AJ235667 gb:Homo sapiens mRNA for immunoglobulin 278.8 5.5

301976 T97905 Hs.77256 enhancer of zeste (Drosophila) homolog 2 4792 5.4

316021 AW293399 Hs.144904 nuclear receptor co-repressor 1 792.4 5.3 

320802 BE336699 Hs.185055 BENE protein 2423.8 53 

317282 AI733112 Hs.176101 ESTs 5232 5.1

316827 A1380429 Hs.172445 ESTs 578 5.1

303190 BE280787 Hs.16079 hypothetical protein FLJ10233 223 5.1

315587 A1268399 Hs.140489 ESTs 1362 5

333122 predicted exon 399 5

310214 A1220072 Hs.165893 ESTs 234.4 4.9

320089 D43945 Hs.113274 transcription factor EC 68 4.9

309328 AW024348 Hs.233191 EST, Weakly similar to A27217 glucose tr 258.8 4.8

31B971 Z44067 Hs.10957 ESTs 376.6 4.8 

327220 predicted exon 47.4 4.7 

315757 AW014605 Hs.179872 ESTs 177.4 4.7

320730 R68869 Hs.151072 ESTs 2052 4.6

313339 A1662536 Hs.163495 Homo sapiens cDNA FLJ13608 fis, clone PL 260 4.5

318634 T49598 Hs.156832 ESTs 4752 AS

320955 AW820035 Hs.278579 a disintegrin and metalloproteinase doma 388.6 4.4

306605 AI000497 Hs.119500 ribosomal protein, large P2 81.6 4.4

309349 AW051913 gb:wx24a09 jc1 NCLCGAPJQdl 1 Homo sapien 102.4 4.3 

306004 AA889992 Hs.2186 fwkaryoBdransi^^ 4512 42 

330020 predicted exon 612 4.1 

302308 AW327279 Hs.91379 ribosomal protein L26 342 3.9

314648 AW979268 gb:EST391378 MAGE resequences, MAGP Homo 56.4 3.8

315131 AI753709 Hs.152484 ESTs 130.4 3.7 
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313690 


AI497591 Hs.78146 platelet/endothelial cell adhesion molec 3179£ 3.6
333585 predicted exon 175.4 3.5
312911 H93366 Hs.7567 Homo sapiens cDNA: FLJ21962 fis, clone H 219 33
322966 AAB33BS9 Hs. 935090 Homo sapiens cell recognition molecule C 350.2 3.4
312492 071 079 Hs. 101980 ESTs 322.8 3
31BS88 Z442D3 HS.2641B ESTs 25 15
332363 AM 97705 Hs 108932 ESTs 773.4 15
324181 AI09547B He 131898 ESTs 634.6 14
311717 AW205369 Hs.312830 ESTs 542 14
321342 AA127984 Hs.222024 transcription factor BMAL2 23.4 13
308852 AIR90848 Hs 189037 npnfifMnroMfeampra<»^ffvcionhifinAl 92 23
331466 AA373210 Hs.43047 Homo sapiens cDNA FLJ13585 fis, clone PL 494 23
320279 A6033062 Hs.134970 DKFZP434N178 protein 76.2 22
322221 N94938 Hs.170889 nucleosome assembly protein 1-like 1 2532 11
AL137449 Hs.198888 homeo box B4 136.6 11
331384 AB041035 Hs.93847 NADPH oxidase 4 720 13
300938 AA514418 Hs.159790 ESTs, Weakly similar to 1605244A erythro 27 13
J 1 lOM AW108R83 Hs 900949 ESTs 303.8 1.6
W75179 Hs 987449 ESTs 189 13
332743 AW947077 Hs.87595 translocase of inner mitochondrial membr 14.4 1.4
771 030 AW378885 Hs.18695 Mitochondrial Acyl-CoA Thioesterase 529.8 1.4
333123 predicted exon 3952 1.4
328455 predicted exon 913 1.3
334458 predicted exon 406.4 13
313478 AA543008 Hs.192775 ESTs 413.4 1.1




AW338564 Hs.217493 annexin A2 -30.8
311775 AW294416 Hs.144687 Homo sapiens cDNA FLJ12981 fis, clone NT -62.8 1
NM_001992 Hs.128087 coagulation factor II (thrombin) recepto -73.6 1
AW367295 Hs241175 ESTs 43.8 1
7179Q1 AI267970 Hs.150614 ESTs, Weakly similar to ALU4_HUMAN ALU S -63 1
71C050 AW275110 Hs2711Q6 ESTs -67 1
799984 AI792140 Hs.49265 ESTs -3952 1
799ZK0 AL121278 Hs25144 ESTs -1.6
79AflA7 AW975183 Hs292663 ESTs 4.4
771 405 AW970939 Hs291039 ESTs -2823
777fiin predicted exon -1516 1
335093 predicted exon -23.2
339403 predicted exon -3312 1
709890 X04588 Hs35844 neurotrophic tyrosine kinase, receptor, 5912 \
709970 R56151 Hs.93589 Homo sapiens mRNA; cDNA DKFZp564B1162 (f 27631
323755 AW300094 Hs.136252 ESTs 135 0.9
predicted exon 727.4 0.9
315343 BE144306 Hs.179891 ESTs, Weakly similar to P4HA_HUMAN PROLY 1223 0.9
711188 AK001270 Hs.196086 hypothetical protein FLJ10408 304 0.9
790779 predicted exon 109.2 0.9
791A15 BE621807 Hs.3337 transmembrane 4 superfamily member 1 414.8 07
7771 9i predicted exon 873 0.7
777190 predicted exon 379.8 0.7
770709 AW797956 Hs.75748 proteasome (prosome, macropain) subunit, 5892 0.7
314711 AA769365 Hs.126058 ESTs -87 0.6
7708JK BE409857 Hs.69499 hypothetical protein 347.4 0.6
333169 predicted exon -1182 0.6


775005 predicted exon 106.4 0.6
335815 predicted exon -156 0.6
330232 predicted exon 102.6 0.6
770897 AA031565 Hs.221255 ESTs, Moderately similar to ALU5_HUMAN A -62 05
331704 F04225 Hs.66032 ESTs -14.6 03
709849 NM_016428 Hs.130719 NESH protein 267.6 03
704484 AA432067 Hs.258373 ESTs 85 03
710970 AKD0O377 Hs.144840 homolog of mouse C2PA -70 04
OU iOO I AK7774S2 Hs.134084 ESTs -195.4 0.4
70fi777 AA954221 Hs.73742 ribosomal protein, large, P0 -33.4 0.4
771797 N46436 Hs.109221 ESTs -392 0.4
779Q81 predicted exon -5.6 0.4
322796 W31178 Hs.154140 Homo sapiens ovary-specific acidic prote -880.6 03
79R857 predicted exon 552 03
718749 AA743935 Hs.202329 ESTs 43.4 03
331263 AW780192 Hs.267596 ESTs -180.4 03
335987 predicted exon -134 03
311923 T60843 Hs.169679 ESTs 122 03
310522 AW134529 Hs^44647 ESTs -1673 03
315363 AA759190 Hs.121454 ESTs, Weakly similar to olfactory recept 80 03
302032 NM_001992 Hs.128087 coagulation factor II (thrombin) recepto -877 03
313140 BE265133 Hs.217493 annexin A2 95.4 0.3
310860 AW015920 Hs.161359 ESTs -239 03
317899 AI952430 Hs.150614 ESTs, Weakly similar to ALU4_HUMAN ALU S -7152 03
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328520 predicted exon -109.2 02 

302408 NM_012099 Hs.211956 CD3-epsilon-associated protein; antisens 10 02

311804 AI866921 Hs2Q3349 Homo sapiens cDNA FLJ12149 fis, clone MA -252.6 02

315065 AK001122 Hs.105859 hypothetical protein FLJ10260 46.2 0.2

314129 AA228366 Hs.115122 ESTs -308.8 02

335697 predicted exon 47.2 02 

335989 predicted exon 89 02 
320606 AW867943 Hs.127216 hypothetical protein FLJ13465 -205.6 02
329745 predicted exon 103 0.2 

313628 AW419069 Hs209670 ESTs -177.8 0.2

334616 predicted exon -936.6 02 

308820 A1821267 Hs207243 EST -7.2 02 

320416 AI026984 Hs293662 ESTs -18.4 0.2 

335211 predicted exon -142 02 

323629 AA375957 Hs.6682 ESTs -100 0.1

331420 AW452904 gMJW«l3*y^11.04JJ.s1 NCLCGAP.Su 83 0.1 

315984 AI015862 Hs.131793 ESTs -250.6 0.1 

332833 predicted exon -3742 0.1 

332607 NM_002314 Hs.36566 LIM domain kinase 1 -27.6 0.1

313467 AA004879 Hs.187820 ESTs -2882 0.1

323333 AV651680 Hs208558 ESTs -735.6 0.1

330775 AW247020 Hs250747 SUMO-1 activating enzyme subunit 1 53.6 0.1

333168 predicted exon -1041.6 0.1 

332079 A1308876 Hs.103849 ESTs 19.4 0.1 

322724 AF161442 Hs.191591 Homo sapiens HSPC324 mRNA, partial cds -123.6 0.1

303652 AI799111 Hs.64341 ESTs 46.4 0.1 

303131 AW081061 Hs.103180 DC2 protein -156.4 0.1 

32)716 AI479439 Hs.171532 ESTs -146.6 0.1 

300454 AA659037 Hs.163780 ESTs -304 0.1 

312757 AI285970 Hs.183817 ESTs -445 0.1

312391 R43707 Hs.133159 ESTs, Weakly similar to PIHUSD sarrvary -111JB 0.1 

308877 AI832519 gb:at69h03.x1 Barstead colon HPLRB7 Homo -149.6 0 

311275 AI659166 Hs207144 ESTs -62.6 0 

302363 AW163799 Hs.198365 2,3-bisphosphoglycerate mutase -15 0

321717 AW956580 Hs.42699 ESTs -1059.6 0

302638 AA463798 Hs.102696 MCT-1 protein -332.2 0

306352 AA961367 gb:or52a05.s1 NCI_CGAP_GC3 Homo sapiens 21.8 0

313798 AI292146 Hs.71622 SWI/SNF related, matrix associated, acti -972 0

320807 AA135370 Hs.188536 Homo sapiens cDNA: FLJ21635 fis, clone C -2222 0

320931 AW262836 Hs.252844 ESTs -881.6 0

332450 AW288085 Hs.11156 hypothetical protein 28.4 0

332535 AF167706 Hs.19280 cysteine-rich motor neuron 1 -722 0

335990 predicted exon 421 0 
330746 AB033888 Hs.8619 SRY (sex determining region Y)-box 18 35.4 0

316820 AI627912 Hs.130783 Forssman synthetase -373.6 0

337429 predicted exon -257 0

331192 BE622021 Hs.152571 ESTs, Highly similar to IGF-II mRNA-bind -33 0

330609 A1346201 Hs.76116 ubiquitin carboxyl-terminal esterase L1 -280 0

323593 AI739435 Hs39168 ESTs -3627.6 0

302704 AA531133 Hs.4253 hypothetical protein MGC2574 -278.6 0

330534 NM_004579 Hs.82979 mitogen-activated protein kinase kinase -244 0

332374 X91195 Hs.100623 phospholipase C, beta 3, neighbor pseudo -12042 0 

333221 predicted exon -189.6 0 

335988 predicted exon -122.6 0 

330574 A1984144 Hs.66713 hepatitis delta antigen-interacting prot -2257.4 0

312052 BE621697 Hs.14317 nucleolar protein family A, member 3 (H/ -3592 0

319568 AF1317B1 Hs.84753 hypothetical protein FLJ12442 -874.6 0

337113 predicted exon -24.6 0 

335149 predicted exon -191.8 0 
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TABLE 6A 

Table 6A shows the accession numbers for those pkeys lacking unigeneID's for Table 6. The pkeys in Table 7 lacking unigeneID's are represented within
Tables 1-6A. For each probeset we have listed the gene cluster number from which the oligonucleotides were designed. Gene clusters were compiled
using sequences derived from Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity using Clustering and
Alignment Tools (DoubleTwist, Oakland, California). The Genbank accession numbers for sequences comprising each cluster are listed in the "Accession"
column.

Pkey Unique Eos probeset identifier number 

CAT number Gene duster number 
Accession: Genbank accession numbers 

Pkey CAT Number Accession 

321925 1525201_1 D62892 D79755 D62760 
321614 87866_1 H86161 AA054308 AA018955 
313952 136885_1 F20956 AA129374 AA133740 AW819878 
314648 293660_1 AW979268 AA878419 AA431342 AA431628 
302749 458_107 M16951 M16952 M16948 M16949 M16950 
312362 764066_1 AW015994 R39898 AW000978 AI598202 AB21706 
312542 1522649_1 D60076 D60259 D81037 
312642 1005225_1 AW052128 H51439 H51481 
312986 171879_1 AA211586 F35799 AA211641 F29720 AW937387 AW937408 
329350 cjchs 
329414 c_y_hs 
329440 c_y_hs 
329451 c_y_hs 
338033 CH22_6528FG__LINK_EM:AC00 
338038 CH22_6535FG__LINK_EM:AC00 
338116 CH22_6650FG__LINK_EM:AC00 
338158 CH22_6700FG__LINK_EM:AC00 
329732 c14_p2 
329745 c14_p2 
308106 AI476803 
329863 c14_p2 
338316 CH22_6944FG__LINK_EM:AC00 
308248 AI560919 
338388 CH22_7034FG__LINK_EM:AC00 
338442 CH2^71MFG_UNICEMAC0O 
338645 CH22_7410FG__LINK_EM:AC00 
338728 CH22_7527FG__LINK_EM:AC00 
308877 AI832519 
338962 CH22_7838FG__LINK_DJ32I10 
308886 AI833240 
333120 CH22_349FG_81_3_LINK_EM:A 
333121 CH22_350FG_81_4_LINK_EM:A 
333122 CH22_351FG_81_6_LINK_EM:A 
333123 CH22_352FG_81_7_LINK_EM:A 
333168 CH22_400FG_94_1_LINK_EM:A 
333169 CH22_401FG_94_2_LINK_EM:A 
333221 CH22_458FG_105_1_LINK_EM: 
326077 c17_hs 
326080 c17_hs 
326169 c17_hs 
326198 c17_hs 
326230 c17_hs 
333585 CH22_846FG_203_4_LINK_EM: 
333610 CH22_871FG_217_5_LINK_EM: 
335093 CH22_2423FG_492_3_LINK_EM 
335095 CH22_2425FG_492_5_LINK_EM 
335149 CH22_2484FG_499_5_LINK_EM 
326759 c20_hs 
333977 CH22_1254FG_309_6_LINK_EM 
326788 c20_hs 
335211 CH22_2550FG_511_2_LINK_EM 
305192 AA666019 
303973 AW512014 
303992 AW515800 
326946 c21_hs 
328229 c_6_hs 
328262 c_6_hs 


328418 c_7_hs 
328455 c_7_hs 
335697 CH22_3058FG_596_12_LINK_E 
328520 c_7_hs 
328548 c_7_hs 
335815 CH22_3187FG_618_3_LINK_EM 
328688 c_7_hs 
328695 c_7_hs 
307010 AI140014 
337113 CH22_5058FG_493_1_ 
307041 AJ144243 
328700 c_7_hs 
335946 CH22_3324FG_646_20_LINK_D 
335985 CH22_3366FG_654_10_LINK_D 
335987 CH22_3367FG_654_11_LINK_D 
335988 CH22_3368FG_654_12_LINK_D 
335989 CH22_3369FG_655_2_LINK_DJ 
335990 CH22_3370FG_655_4_LINK_DJ 
337214 CH22_5288FG_613_7_ 
330020 c16_p2 
305989 AA888220 
328857 c_7_hs 
328937 c_8_hs 
328957 c_8_hs 
330187 cj _p2 
337407 CH22_5607FG_755_1_ 
337429 CH22_5633FG_762_3_ 
330232 c.5 J& 
307414 AI242106 
330305 c_7_p2 
330306 c_7_p2 
337603 CH22_5896FG__LINK_C20H12. 
337953 CH22_6395FG__LINK_EM:AC00 
339236 CH22_8181FG__LINK_BA354I1 
339403 CH22_8384FG__LINK_BA232E1 
309349 AW051913 
325222 c10_hs 
325251 c10_hs 
316188 956161_1 AI792566 AI053836 AI054127 AI792489 AI288324 
309871 AW300366 
325544 d*Jns 
309931 AW341683 
332833 CH22_50FG_17_7_LINK_C20H1 
302779 33837_1 AJ235667 AJ235666 AJ235664 AJ235665 AJ235668 AJ235669 AJ235670 
302790 34168_1 AJ245245 AJ245247 AJ245257 AJ245248 AJ245254 AJ245256 AJ245253 AJ245203 AJ245250 AJ245252 AJ245243 AJ245204 AJ245201 AJ245206 AJ245246 AJ245255 AJ245205 AJ245202 AJ245251 AJ245249 AJ245207 AJ245244 
332961 CH22_185FG_48_18_LINK_EM: 
325753 c14_hs 
327036 c21_hs 
325843 c16_hs 
325889 c16_hs 
304261 AA059387 
304275 AA070605 
334376 CH22_1670FG_379_8_LINK_EM 
327220 c_1_hs 
304363 AA206045 
334458 CH22_1757FG_391_2_LINK_EM 
327365 c_1_hs 
327373 cJLte 
334616 CH22_1923FG_411_15_LINK_E 
327414 cJJb 
327568 c_3_hs 
336034 CH22_3419FG_678_5_LINK_DJ 
336059 CH22_3445FG_684_2_LINK_DJ 
334834 CH22_2148FG_439_3_LINK_EM 
304782 AA582081 
304876 AA595765 
327747 c_5_hs 
336228 CH22_3626FG_730_4_LINK_DA 
329073 <uO» 
329088 c_x_hs 
304969 AA614406 
327844 c_5_hs 
327876 c_6_hs 
306352 AA961367 
331131 genbank_R54797 R54797 
331420 675963_1 AW452904 AW449414 BE467905 AJ298565 BE549332 BE326357 F04362 
TABLE 6B 

Table 6B shows the genomic positioning for those pkeys lacking unigene ID's and accession numbers in Table 6. The pkeys in Table 7 lacking 
unigeneID's are represented within Tables 1-6B. For each predicted exon, we have listed the genomic sequence source used for prediction. Nucleotide 
locations of each predicted exon are also listed. 

Pkey: Unique number corresponding to an Eos probeset 
Ref: Sequence source. The 7 digit numbers in this column are Genbank Identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495. 
Strand: Indicates DNA strand from which exons were predicted 
Nt_position: Indicates nucleotide positions of predicted exons. 



Pkey    Ref                Strand  Nt_position 

332961  Dunham, I. et al.  Plus   2521424-2521555 
333221  Dunham, I. et al.  Plus   3978070-3978187 
333585  Dunham, I. et al.  Plus   6234778-6234894 
333610  Dunham, I. et al.  Plus   6547007-6547116 
334376  Dunham, I. et al.  Plus   13902218-13902331 
334458  Dunham, I. et al.  Plus   14353496-14353572 
334616  Dunham, I. et al.  Plus   15176123-15176470 
335149  Dunham, I. et al.  Plus   21497441-21497587 
335211  Dunham, I. et al.  Plus   21774611-21774680 
335697  Dunham, I. et al.  Plus   25481456-25481649 
335986  Dunham, I. et al.  Plus   27967791-27967852 
335987  Dunham, I. et al.  Plus   27971413-27971481 
335988  Dunham, I. et al.  Plus   27977912-27978013 
335989  Dunham, I. et al.  Plus   27983788-27983860 
335990  Dunham, I. et al.  Plus   27988532-27988608 
336034  Dunham, I. et al.  Plus   29014404-29014590 
337953  Dunham, I. et al.  Plus   6827029-6827125 
338033  Dunham, I. et al.  Plus   8092128-8092271 
338038  Dunham, I. et al.  Plus   8138219-8138392 
338316  Dunham, I. et al.  Plus   17089711-17089988 
338442  Dunham, I. et al.  Plus   19980640-19980698 
338962  Dunham, I. et al.  Plus   29581892-29582020 
332833  Dunham, I. et al.  Minus  1119848-1119705 
333120  Dunham, I. et al.  Minus  3307508-3307427 
333121  Dunham, I. et al.  Minus  3308446-3308358 
333122  Dunham, I. et al.  Minus  3309596-3309531 
333123  Dunham, I. et al.  Minus  3310817-3310749 
333168  Dunham, I. et al.  Minus  3729896-3729788 
333169  Dunham, I. et al.  Minus  3730864-3730767 
333977  Dunham, I. et al.  Minus  8722928-8722725 
334834  Dunham, I. et al.  Minus  17182681-17182535 
335093  Dunham, I. et al.  Minus  21297367-21297214 
335095  Dunham, I. et al.  Minus  21292546-21292381 
335815  Dunham, I. et al.  Minus  26320518-26320421 
335946  Dunham, I. et al.  Minus  27487203-27487035 
336059  Dunham, I. et al.  Minus  29184079-29183969 
336228  Dunham, I. et al.  Minus  30904602-30904497 
337113  Dunham, I. et al.  Minus  21233344-21233237 
337214  Dunham, I. et al.  Minus  26095902-26095502 
337407  Dunham, I. et al.  Minus  31886652-31886567 
337429  Dunham, I. et al.  Minus  32086238-32086079 
337603  Dunham, I. et al.  Minus  1299296-1299194 
338116  Dunham, I. et al.  Minus  10614071-10613814 
338158  Dunham, I. et al.  Minus  11794465-11794343 
338388  Dunham, I. et al.  Minus  18662403-18662305 
338645  Dunham, I. et al.  Minus  2406383^-24063775 
338728  Dunham, I. et al.  Minus  25949039-25948927 
339236  Dunham, I. et al.  Minus  32773355-32773202 
339403  Dunham, I. et al.  Minus  34050728-34050625 
325222  6525287            Minus  22332-22473 
325251  6682448            Minus  411693-411751 
325544  6682452            Plus   171228-171286 
325753  6682474            Plus   398512-398621 
329745  6065779            Plus   174774-175142 
329732  6065763            Plus   161252-161322 
329863  6691797            Plus   196801-196971 
325889  5867087            Plus   223829-223891 
325843  6552453            Minus  7126-7232 
330020  6671887            Plus   172397-172491 
326198  5867215            Minus  80295-60674 
326230  5867230            Minus  301668-301972 
326169  5867255            Minus  128321-128388 
326077  6682495            Minus  312108-312168 
326080  6682495            Plus   478644-478847 
326759  6249610            Plus   97216-97311 
326788  6682503            Plus   277132-277335 
326946  6004446            Minus  116677-116967 
327036  6531965            Plus   319951-320040 
327220  5867525            Minus  65701-65781 
327365  6552412            Minus  118133-118198 
327414  [illegible]        Plus   102461-102586 
[illegible]  [illegible]   Minus  8186-8742 
327568  [illegible]        Minus  46152-46287 
330187  6706138            [illegible]  212923-213020 
327747  5867947            Plus   115322-115498 
327844  6249582            Minus  18895-18958 
330232  [illegible]        Plus   113655-113830 
328229  5868105            Minus  120936-121053 
327876  5868140            Plus   103882-104034 
328262  6381906            Plus   11867-12027 
328688  5868262            Plus   626030-626094 
328700  5868264            Plus   764089-764203 
328695  5868264            Plus   318632-318695 
328418  [illegible]        Minus  258811-258894 
328455  5868431            Plus   385576-385633 
[illegible]  [illegible]   Plus   1942075-1942246 
328548  [illegible]        Plus   72301-72397 
328857  [illegible]        Minus  80557-81051 
[illegible]  [illegible]   Minus  52769-52365 
330306  4377382            Plus   96161-96233 
328937  5868500            Minus  1448241-1448333 
328957  6456773            Plus   219195-219297 
329073  5868596            Plus   37838-37956 
329088  5868608            Plus   116738-116950 
329350  6456785            Plus   98911-98969 
329414  5868874            Plus   942555-942643 
329440  5868885            Plus   21943-22063 
329451  5868887            Plus   25974-26048 
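
The Nt_position column of Table 6B gives each predicted exon as a pair of nucleotide positions. As a minimal sketch of how such an entry could be read programmatically, the following Python turns one row into a record and computes the exon length. The names `Exon` and `parse_row` are hypothetical, and the sketch assumes that both positions are inclusive, which the table itself does not state.

```python
from typing import NamedTuple


class Exon(NamedTuple):
    """One row of Table 6B (hypothetical record type for illustration)."""
    pkey: str
    ref: str
    strand: str
    first: int   # first position as printed
    second: int  # second position as printed

    @property
    def length(self) -> int:
        # Assumption: both positions are inclusive. abs() makes the result
        # independent of which position is printed first.
        return abs(self.second - self.first) + 1


def parse_row(pkey: str, ref: str, strand: str, nt_position: str) -> Exon:
    """Split a 'first-second' Nt_position string into two integers."""
    first, second = (int(part) for part in nt_position.split("-"))
    return Exon(pkey, ref, strand, first, second)


plus_exon = parse_row("332961", "Dunham, I. et al.", "Plus", "2521424-2521555")
minus_exon = parse_row("332833", "Dunham, I. et al.", "Minus", "1119848-1119705")
print(plus_exon.length, minus_exon.length)  # 132 144
```

Taking the absolute difference is a deliberate choice here: rows on the Minus strand are printed in both descending and ascending order in the table, so the sketch does not rely on either convention.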



TABLE 7: 

Table 7 depicts Seq ID No., UnigeneID, UnigeneTitle, Pkey, and ExAccn for all of the sequences in Table 8. Seq ID No links the nucleic acid and protein 
sequence information in Table 8 to Table 7. 

Pkey: Unique Eos probeset identifier number 
ExAccn: Exemplar Accession number, Genbank accession number 
UnigeneID: Unigene number 
Unigene Title: Unigene gene title 
Seq_ID_No: Sequence Identification Number found in Table 8 

Pkey    ExAccn       Unigene ID   Unigene Title                             SEQ ID NO 

101545  BE246154     Hs.154210    endothelial differentiation, sphingolipi  Seq ID 1 & 2 
115819  AA486620     Hs.41135     endomucin-2                               Seq ID 3 & 4 
424503  NM_002205    Hs.149609    integrin, alpha 5 (fibronectin receptor,  Seq ID 5 & 6 
102917  AI016712     Hs.287797    integrin, beta 1 (fibronectin receptor,   Seq ID 7 & 8 
102915  X07820       Hs.2258      matrix metalloproteinase 10 (stromelysin  Seq ID 9 & 10 
105330  AW338625     Hs.22120     ESTs                                      Seq ID 11 & 12 
107385  NM_005397    Hs.16426     podocalyxin-like                          Seq ID 13 & 14 
102024  AA301867     Hs.76224     EGF-containing fibulin-like extracellula  Seq ID 15 & 16 
102024  AA301867     Hs.76224     EGF-containing fibulin-like extracellula  Seq ID 17 & 18 
134416  X68264       Hs.211579    melanoma cell adhesion molecule           Seq ID 19 & 20 
103038  M13509       Hs.83169     matrix metalloproteinase 1 (interstitial  Seq ID 21 & 22 
104865  T79340       Hs.22575     B-cell CLL/lymphoma 6, member B (zinc fi  Seq ID 23 & 24 
106124  H93366       Hs.7567      Homo sapiens cDNA: FLJ21962 fis, clone H  Seq ID 25 & 26 
109001  AI056548     Hs.72116     hypothetical protein FLJ20992 similar to  Seq ID 27 & 28 
104754  AI039243     Hs.278585    ESTs                                      Seq ID 29 & 30 
133200  AB037715     Hs.183639    hypothetical protein FLJ10210             Seq ID 31 & 32 
105263  AW388633     Hs.6682      solute carrier family 7, (cationic amino  Seq ID 33 & 34 
102892  BE440042     Hs.83326     matrix metalloproteinase 3 (stromelysin   Seq ID 35 & 36 
109456  AW958580     Hs.42699     ESTs                                      Seq ID 37 & 38 
110906  AA035211     Hs.17404     ESTs                                      Seq ID 39 & 40 
119073  BE245360     Hs.279477    ESTs                                      Seq ID 41 & 42 
132050  AI267615     Hs.38022     ESTs                                      Seq ID 43 & 44 
132490  NM_001290    Hs.4980      LIM domain binding 2                      Seq ID 45 & 46 
102283  AW161552     Hs.83381     guanine nucleotide binding protein 11     Seq ID 47 & 48 
101714  M68874       Hs.211587    phospholipase A2, group IVA (cytosolic,   Seq ID 49 & 50 
133975  C18356       Hs.295944    tissue factor pathway inhibitor 2         Seq ID 51 & 52 
106793  H94997       Hs.16450     ESTs                                      Seq ID 53 & 54 
118511  N75620       Hs.43167     ESTs                                      Seq ID 54 & 55 
101447  M21305                    gb:Human alpha satellite and satellite 3  Seq ID 56 & 57 
314941  AA515902     Hs.130650    ESTs                                      Seq ID 58 & 59 
332466  AB018259     Hs.118140    KIAA0716 gene product                     Seq ID 60 & 61 
313513  AW298600     Hs.141840    ESTs, Weakly similar to S59501 interfero  Seq ID 62 & 63 
313556  AA628517     Hs.118502    ESTs                                      Seq ID 64 & 65 
313665  AW751201     Hs.51233     ESTs                                      Seq ID 66 & 67 
314372  AU040178     Hs.142003    ESTs                                      Seq ID 68 & 69 
429275  AF056085     Hs.198612    G protein-coupled receptor 51             Seq ID 70 & 71 
101345  NM_005795    Hs.152175    calcitonin receptor-like                  Seq ID 72 & 73 
418994  AA296520     Hs.89546     selectin E (endothelial adhesion molecul  Seq ID 74 & 75 
103850  AA187101     Hs.213194    hypothetical protein MGC10895             Seq ID 76 & 77 
133260  AA403045     Hs.6906      Homo sapiens cDNA: FLJ23197 fis, clone R  Seq ID 78 & 79 
101097  BE245301     Hs.89414     chemokine (C-X-C motif), receptor 4 (fus  Seq ID 80 & 81 
104786  AA027167     Hs.10031     KIAA0955 protein                          Seq ID 82 & 83 
132173  X89426       Hs.41716     endothelial cell-specific molecule 1      Seq ID 84 & 85 
100420  D86983       Hs.118893    Melanoma associated gene                  Seq ID 86 & 87 
111018  AI287912     Hs.3628      mitogen-activated protein kinase kinase   Seq ID 88 & 89 
108507  AI554545     Hs.68301     ESTs                                      Seq ID 90 & 91 
104694  AF065214     Hs.18858     phospholipase A2, group IVC (cytosolic,   Seq ID 92 & 93 
118511  N75620       Hs.43157     ESTs                                      Seq ID 94 & 95 
125609  AA868063     Hs.104576    carbohydrate (keratan sulfate Gal-6) sul  Seq ID 96 & 97 
101543  M31166       Hs.2050      pentaxin-related gene, rapidly induced b  Seq ID 98 & 99 
102241  NM_007351    Hs.268107    multimerin                                Seq ID 100 & 101 
101560  AW958272     Hs.347326    intercellular adhesion molecule 2         Seq ID 102 & 103 
103280  U84722       Hs.76206     cadherin 5, type 2, VE-cadherin (vascula  Seq ID 104 & 105 
105826  AA478756     Hs.194477    E3 ubiquitin ligase SMURF2                Seq ID 106 & 107 
102804  NM_002318    Hs.83354     lysyl oxidase-like 2                      Seq ID 108 & 109 
131647  AA359515     Hs^0089      ESTs                                      Seq ID 110 & 111 
103095  NM_005424    Hs.78824     tyrosine kinase with immunoglobulin and   Seq ID 112 & 113 
103037  BE018302     Hs.7894      placental growth factor, vascular endoth  Seq ID 114 & 115 
100405  AW291587     Hs.82733     nidogen 2                                 Seq ID 116 & 117 
102012  BE259035     Hs.118400    singed (Drosophila)-like (sea urchin fas  Seq ID 118 & 119 
[illegible]  [illegible]  Hs.82353     [illegible]                               Seq ID 120 & 121 
105790  [illegible]  [illegible]  [illegible]                               Seq ID 122 & 123 
[illegible]  [illegible]  [illegible]  melanoma cell adhesion molecule           Seq ID 124 & 125 
[illegible]  [illegible]  Hs.2271      [illegible]                               Seq ID 126 & 127 
[illegible]  F06972       Hs.97379     BMX non-receptor tyrosine kinase          Seq ID 128 & 129 
134299  [illegible]  Hs.97199     [illegible]                               Seq ID 130 & 131 
[illegible]  [illegible]  Hs.106384    [illegible]                               Seq ID 132 & 133 
[illegible]  [illegible]  Hs.283079    [illegible]                               Seq ID 134 & 135 
[illegible]  NM_003003    [illegible]  [illegible]                               Seq ID 136 & 137 
[illegible]  [illegible]  [illegible]  ubiquitin carboxyl-terminal esterase L1   Seq ID 138 & 139 
[illegible]  [illegible]  Hs.168383    intercellular adhesion molecule 1 (CD54)  Seq ID 140 & 141 
[illegible]  [illegible]  [illegible]  nucleoside phosphorylase                  Seq ID 142 & 143 
[illegible]  [illegible]  [illegible]  [illegible]                               Seq ID 144 & 145 
[illegible]  [illegible]  [illegible]  TEK tyrosine kinase, endothelial (venous  Seq ID 146 & 147 
[illegible]  [illegible]  [illegible]  von Willebrand factor                     Seq ID 148 & 149 
[illegible]  [illegible]  [illegible]  G protein-coupled receptor [illegible]    Seq ID 150 & 151 
322262  [illegible]  [illegible]  ESTs                                      Seq ID 152 & 153 
[illegible]  [illegible]  [illegible]  [illegible]                               Seq ID 154 & 155 
[illegible]  AB037821     [illegible]  [illegible]                               Seq ID 156 & 157 
[illegible]  [illegible]  Hs.13957     ESTs                                      Seq ID 158 & 159 
TABLE 8 



Seq ID NO: 1 DNA sequence 

Nucleic Acid Accession #: NM_001400 

Coding sequence: 244-2208 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51 

|          |          |          |          |          | 

GTCGGGGGCA GCAGCAAGAT GCGAAGCGAG CCGTACAGAT CCCGGGCTCT CCGAACGCAA 60 

CTTCGCCCTG CTTGAGCGAG GCTGCGGTTT CCGAGGCCCT CTCCAGCCAA GGAAAAGCTA 120 

CACAAAAAGC CTGGATCACT CATCGAACCA CCCCTGAAGC CAGTGAAGGC TCTCTCGCCT 180 

CGCCCTCTAG CGTTCGTCTG GAGTAGCGCC ACCCCGGCTT CCTGGGGACA CAGGGTTGGC 240 

ACCATGGGGC CCACCAGCGT CCCGCTGGTC AAGGCCCACC GCAGCTCGGT CTCTGACTAC 300 

GTCAACTATG ATATCATCGT CCGGCATTAC AACTACACGG GAAAGCTGAA TATCAGCGCG 360 

GACAAGGAGA ACAGCATTAA ACTGACCTCG GTGGTGTTCA TTCTCATCTG CTGCTTTATC 420 

ATCCTGGAGA ACATCTTTGT CTTGCTGACC ATTTGGAAAA CCAAGAAATT CCACCGACCC 480 

ATGTACTATT TTATTGGCAA TCTGGCCCTC TCAGACCTGT TGGCAGGAGT AGCCTACACA 540 

GCTAACCTGC TCTTGTCTGG GGCCACCACC TACAAGCTCA CTCCCGCCCA GTGGTTTCTG 600 

CGGGAAGGGA GTATGTTTGT GGCCCTGTCA GCCTCCGTGT TCAGTCTCCT CGCCATCGCC 660 

ATTGAGCGCT ATATCACAAT GCTGAAAATG AAACTCCACA ACGGGAGCAA TAACTTCCGC 720 

CTCTTCCTGC TAATCAGCGC CTGCTGGGTC ATCTCCCTCA TCCTGGGTGG CCTGCCTATC 780 

ATGGGCTGGA ACTGCATCAG TGCGCTGTCC AGCTGCTCCA CCGTGCTGCC GCTCTACCAC 840 

AAGCACTATA TCCTCTTCTG CACCACGGTC TTCACTCTGC TTCTGCTCTC CATCGTCATT 900 

CTGTACTGCA GAATCTACTC CTTGGTCAGG ACTCGGAGCC GCCGCCTGAC GTTCCGCAAG 960 

AACATTTCCA AGGCCAGCCG CAGCTCTGAG AAGTCGCTGG CGCTGCTCAA GACCGTAATT 1020 

ATCGTCCTGA GCGTCTTCAT CGCCTGCTGG GCACCGCTCT TCATCCTGCT CCTGCTGGAT 1080 

GTGGGCTGCA AGGTGAAGAC CTGTGACATC CTCTTCAGAG CGGAGTACTT CCTGGTGTTA 1140 

GCTGTGCTCA ACTCCGGCAC CAACCCCATC ATTTACACTC TGACCAACAA GGAGATGCGT 1200 

CGGGCCTTCA TCCGGATCAT GTCCTGCTGC AAGTGCCCGA GCGGAGACTC TGCTGGCAAA 1260 

TTCAAGCGAC CCATCATCGC CGGCATGGAA TTCAGCCGCA GCAAATCGGA CAATTCCTCC 1320 

CACCCCCAGA AAGACGAAGG GGACAACCCA GAGACCATTA TGTCTTCTGG AAACGTCAAC 1380 

TCTTCTTCCT AGAACTGGAA GCTGTCCACC CACCGGAAGC GCTCTTTACT TGGTCGCTGG 1440 

CCACCCCAGT GTTTGGAAAA AAATCTCTGG GCTTCGACTG CTGCCAGGGA GGAGCTGCTG 1500 

CAAGCCAGAG GGAGGAAGGG GGAGAATACG AACAGCCTGG TGGTGTCGGG TGTTGGTGGG 1560 

TAGAGTTAGT TCCTGTGAAC AATGCACTGG GAAGGGTGGA GATCAGGTCC CGGCCTGGAA 1620 

TATATATTCT ACCCCCCTGG AGCTTTGATT TTGCACTGAG CCAAAGGTCT AGCATTGTCA 1680 

AGCTCCTAAA GGGTTCATTT GGCCCCTCCT CAAAGACTAA TGTCCCCATG TGAAAGCGTC 1740 

TCTTTGTCTG GAGCTTTGAG GAGATGTTTT CCTTCACTTT AGTTTCAAAC CCAAGTGAGT 1800 

GTGTGCACTT CTGCTTCTTT AGGGATGCCC TGTACATCCC ACACCCCACC CTCCCTTCCC 1860 

TTCATACCCC TCCTCAACGT TCTTTTACTT TATACTTTAA CTACCTGAGA GTTATCAGAG 1920 

CTGGGGTTGT GGAATGATCG ATCATCTATA GCAAATAGGC TATGTTGAGT ACGTAGGCTG 1980 

TGGGAAGATG AAGATGGTTT GGAGGTGTAA AACAATGTCC TTCGCTCAGG CCAAAGTTTC 2040 

CATGTAAGCG GGATCCGTTT TTTGGAATTT GGTTGAAGTC ACTTTGATTT CTTTAAAAAA 2100 

CATCTTTTCA ATGAAATGTG TTACCATTTC ATATCCATTG AAGCCGAAAT CTGCATAAGG 2160 

AAGCCCACTT TATCTAAATG ATATTAGCCA GGATCCTTGG TGTCCTAGGA GAAACAGACA 2220 

AGCAAAACAA AGTGAAAACC GAATGGATTA ACTTTTGCAA ACCAAGGGAG ATTTCTTAGC 2280 

AAATGAGTCT AACAAATATG ACATCCGTCT TTCCCACTTT TGTTGATGTT TATTTCAGAA 2340 

TCTTGTGTGA TTCATTTCAA GCAACAACAT GTTGTATTTT GTTGTGTTAA AAGTACTTTT 2400 

CTTGATTTTT GAATGTATTT GTTTCAGGAA GAAGTCATTT TATGGATTTT TCTAACCCGT 2460 

GTTAACTTTT CTAGAATCCA CCCTCTTGTC CCCTTAAGCA TTACTTTAAC TGGTAGGGAA 2520 

CGCCAGAACT TTTAAGTCCA GCTATTCATT AGATAGTAAT TGAAGATATG TATAAATATT 2580 

ACAAAGAATA AAAATATATT ACTGTCTCTT TAGTATGGTT TTCAGTGCAA TTAAACCGAG 2640 

AGATGTCTTG TTTTTTTAAA AAGAATAGTA TTTAATAGGT TTCTGACTTT TGTGGATCAT 2700 
TTTGCACATA GCTTTATCAA CTTTTAAACA TTAATAAACT GATTTTTTTA AAG 
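
Each sequence in Table 8 is printed in groups of ten residues, with a running position count at the end of each full line. As a minimal sketch of how such a listing could be turned back into one contiguous string, the following Python drops the counts and joins the groups. The function name `join_sequence_block` is hypothetical and is not part of the specification.

```python
def join_sequence_block(lines):
    """Join ten-residue groups into one string, skipping position counts.

    Any whitespace-separated token made up only of digits is treated as a
    running position count (for example the trailing "60") and is dropped.
    """
    groups = []
    for line in lines:
        for token in line.split():
            if token.isdigit():
                continue  # running position count, not sequence
            groups.append(token)
    return "".join(groups)


# First and last printed lines of Seq ID NO: 3, used only as sample input.
block = [
    "AAGGCCCTGC CAGCTTGGGA GGGAATTGTC CCTGCCTGCT TCTGGAGAAA GAAGATATTG 60",
    "AATCCCGACT TCCATACCTG CTGCTGG",
]
sequence = join_sequence_block(block)
print(len(sequence))  # 87
```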



Seq ID NO: 2 protein sequence: 
Protein Accession #: NP_001391 



1          11         21         31         41         51 

|          |          |          |          |          | 

MGPTSVPLVK AHRSSVSDYV NYDIIVRHYN YTGKLNISAD KENSIKLTSV VFILICCFII 60 
LENIFVLLTI WKTKKFHRPM YYFIGNLALS DLLAGVAYTA NLLLSGATTY KLTPAQWFLR 120 
EGSMFVALSA SVFSLLAIAI ERYITMLKMK LHNGSNNFRL FLLISACWVI SLILGGLPIM 180 
GWNCISALSS CSTVLPLYHK HYILFCTTVF TLLLLSIVIL YCRIYSLVRT RSRRLTFRKN 240 
ISKASRSSEK SLALLKTVII VLSVFIACWA PLFILLLLDV GCKVKTCDIL FRAEYFLVLA 300 
VLNSGTNPII YTLTNKEMRR AFIRIMSCCK CPSGDSAGKF KRPIIAGMEF SRSKSDNSSH 360 
PQKDEGDNPE TIMSSGNVNS SS 
Seq ID NO: 3 Nucleotide sequence: 
Nucleic Acid Accession #: NM_016242 

Coding sequence: 79-864 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51 

|          |          |          |          |          | 

AAGGCCCTGC CAGCTTGGGA GGGAATTGTC CCTGCCTGCT TCTGGAGAAA GAAGATATTG 60 

ACACCATCTA CGGGCACCAT GGAACTGCTT CAAGTGACCA TTCTTTTTCT TCTGCCCAGT 120 

ATTTGCAGCA GTAACAGCAC AGGTGTTTTA GAGGCAGCTA ATAATTCACT TGTTGTTACT 180 

ACAACAAAAC CATCTATAAC AACACCAAAC ACAGAATCAT TACAGAAAAA TGTTGTCACA 240 

CCAACAACTG GAACAACTCC TAAAGGAACA ATCACCAATG AATTACTTAA AATGTCTCTG 300 

ATGTCAACAG CTACTTTTTT AACAAGTAAA GATGAAGGAT TGAAAGCCAC AACCACTGAT 360 

GTCAGGAAGA ATGACTCCAT CATTTCAAAC GTAACAGTAA CAAGTGTTAC ACTTCCCAAT 420 

GCTGTTTCAA CATTACAAAG TTCCAAACCC AAGACTGAAA CTCAGAGTTC AATTAAAACA 480 

ACAGAAATAC CAGGTAGTGT TCTACAACCA GATGCATCAC CTTCTAAAAC TGGTACATTA 540 

ACCTCAATAC CAGTTACAAT TCCAGAAAAC ACCTCACAGT CTCAAGTAAT AGACACTGAG 600 

GGTGGAAAAA ATGCAAGCAC TTCAGCAACC AGCCGGTCTT ATTCCAGTAT TATTTTGCCG 660 

GTGGTTATTG CTTTGATTGT AATAACACTT TCAGTATTTG TTCTGGTGGG TTTGTACCGA 720 

ATGTGCTGGA AGGCAGATCC GGGCACACCA GAAAATGGAA ATGATCAACC TCAGTCTGAT 780 

AAAGAGAGCG TGAAGCTTCT TACCGTTAAG ACAATTTCTC ATGAGTCTGG TGAGCACTCT 840 

GCACAAGGAA AAACCAAGAA CTGACAGCTT GAGGAATTCT CTCCACACCT AGGCAATAAT 900 

TACGCTTAAT CTTCAGCTTC TATGCACCAA GCGTGGAAAA GGAGAAAGTC CTGCAGAATC 960 
AATCCCGACT TCCATACCTG CTGCTGG 



Seq ID NO: 4 Protein sequence: 
Protein Accession #: NP_057326 

1          11         21         31         41         51 

|          |          |          |          |          | 

MELLQVTILF LLPSICSSNS TGVLEAANNS LVVTTTKPSI TTPNTESLQK NVVTPTTGTT 60 
PKGTITNELL KMSLMSTATF LTSKDEGLKA TTTDVRKNDS IISNVTVTSV TLPNAVSTLQ 120 
SSKPKTETQS SIKTTEIPGS VLQPDASPSK TGTLTSIPVT IPENTSQSQV IDTEGGKNAS 180 
TSATSRSYSS IILPVVIALI VITLSVFVLV GLYRMCWKAD PGTPENGNDQ PQSDKESVKL 240 
LTVKTISHES GEHSAQGKTK N 
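
Each nucleotide entry in Table 8 states a coding sequence range, and the protein entry that follows it is the translation of that range. A minimal sketch of this relationship in Python is given below, assuming the range is 1-based and inclusive and that the standard genetic code applies. The names `CODON_TABLE` and `translate_cds` are hypothetical, and the two fragments are short excerpts from Seq ID NO: 3 used only as sample input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(sequence: str, start: int, end: int) -> str:
    """Translate positions start..end (assumed 1-based, inclusive).

    Translation stops at the first stop codon, which is not emitted.
    """
    cds = sequence[start - 1:end].upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Start of the Seq ID NO: 3 coding region (positions 79-93 of that entry).
print(translate_cds("ATGGAACTGCTTCAA", 1, 15))  # MELLQ
# End of the coding region, including the stop codon TGA.
print(translate_cds("GCACAAGGAAAAACCAAGAACTGACAGCTT", 1, 24))  # AQGKTKN
```

The two outputs correspond to the first five and last seven residues of Seq ID NO: 4, which is one way to cross-check a printed protein sequence against its nucleotide entry.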



Seq ID NO: 5 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002205 

Coding sequence: 24..3173 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51 

|          |          |          |          |          | 

CAGGACAGGG AAGAGCGGGC GCTATGGGGA GCCGGACGCC AGAGTCCCCT CTCCACGCCG 60 
TGCAGCTGCG CTGGGGCCCC CGGCGCCGAC CCCCGCTCGT GCCGCTGCTG TTGCTGCTCG 120 
TGCCGCCGCC ACCCAGGGTC GGGGGCTTCA ACTTAGACGC GGAGGCCCCA GCAGTACTCT 180 
CGGGGCCCCC GGGCTCCTTC TTCGGATTCT CAGTGGAGTT TTACCGGCCG GGAACAGACG 240 
GGGTCAGTGT GCTGGTGGGA GCACCCAAGG CTAATACCAG CCAGCCAGGA GTGCTGCAGG 300 
GTGGTGCTGT CTACCTCTGT CCTTGGGGTG CCAGCCCCAC ACAGTGCACC CCCATTGAAT 360 
TTGACAGCAA AGGCTCTCGG CTCCTGGAGT CCTCACTGTC CAGCTCAGAG GGAGAGGAGC 420 
CTGTGGAGTA CAAGTCCTTG CAGTGGTTCG GGGCAACAGT TCGAGCCCAT GGCTCCTCCA 480 
TCTTGGCATG CGCTCCACTG TACAGCTGGC GCACAGAGAA GGAGCCACTG AGCGACCCCG 540 
TGGGCACCTG CTACCTCTCC ACAGATAACT TCACCCGAAT TCTGGAGTAT GCACCCTGCC 600 
GCTCAGATTT CAGCTGGGCA GCAGGACAGG GTTACTGCCA AGGAGGCTTC AGTGCCGAGT 660 
TCACCAAGAC TGGCCGTGTG GTTTTAGGTG GACCAGGAAG CTATTTCTGG CAAGGCCAGA 720 
TCCTGTCTGC CACTCAGGAG CAGATTGCAG AATCTTATTA CCCCGAGTAC CTGATCAACC 780 
TGGTTCAGGG GCAGCTGCAG ACTCGCCAGG CCAGTTCCAT CTATGATGAC AGCTACCTAG 840 
GATACTCTGT GGCTGTTGGT GAATTCAGTG GTGATGACAC AGAAGACTTT GTTGCTGGTG 900 
TGCCCAAAGG GAACCTCACT TACGGCTATG TCACCATCCT TAATGGCTCA GACATTCGAT 960 
CCCTCTACAA CTTCTCAGGG GAACAGATGG CCTCCTACTT TGGCTATGCA GTGGCCGCCA 1020 
CAGACGTCAA TGGGGACGGG CTGGATGACT TGCTGGTGGG GGCACCCCTG CTCATGGATC 1080 
GGACCCCTGA CGGGCGGCCT CAGGAGGTGG GCAGGGTCTA CGTCTACCTG CAGCACCCAG 1140 
CCGGCATAGA GCCCACGCCC ACCCTTACCC TCACTGGCCA TGATGAGTTT GGCCGATTTG 1200 
GCAGCTCCTT GACCCCCCTG GGGGACCTGG ACCAGGATGG CTACAATGAT GTGGCCATCG 1260 
GGGCTCCCTT TGGTGGGGAG ACCCAGCAGG GAGTAGTGTT TGTATTTCCT GGGGGCCCAG 1320 
GAGGGCTGGG CTCTAAGCCT TCCCAGGTTC TGCAGCCCCT GTGGGCAGCC AGCCACACCC 1380 
CAGACTTCTT TGGCTCTGCC CTTCGAGGAG GCCGAGACCT GGATGGCAAT GGATATCCTG 1440 
ATCTGATTGT GGGGTCCTTT GGTGTGGACA AGGCTGTGGT ATACAGGGGC CGCCCCATCG 1500 
TGTCCGCTAG TGCCTCCCTC ACCATCTTCC CCGCCATGTT CAACCCAGAG GAGCGGAGCT 1560 
GCAGCTTAGA GGGGAACCCT GTGGCCTGCA TCAACCTTAG CTTCTGCCTC AATGCTTCTG 1620 
GAAAACACGT TGCTGACTCC ATTGGTTTCA CAGTGGAACT TCAGCTGGAC TGGCAGAAGC 1680 
AGAAGGGAGG GGTACGGCGG GCACTGTTCC TGGCCTCCAG GCAGGCAACC CTGACCCAGA 1740 
CCCTGCTCAT CCAGAATGGG GCTCGAGAGG ATTGCAGAGA GATGAAGATC TACCTCAGGA 1800 
ACGAGTCAGA ATTTCGAGAC AAACTCTCGC CGATTCACAT CGCTCTCAAC TTCTCCTTGG 1860 
ACCCCCAAGC CCCAGTGGAC AGCCACGGCC TCAGGCCAGC CCTACATTAT CAGAGCAAGA 1920 
GCCGGATAGA GGACAAGGCT CAGATCTTGC TGGACTGTGG AGAAGACAAC ATCTGTGTGC 1980 
CTGACCTGCA GCTGGAAGTG TTTGGGGAGC AGAACCATGT GTACCTGGGT GACAAGAATG 2040 
CCCTGAACCT CACTTTCCAT GCCCAGAATG TGGGTGAGGG TGGCGCCTAT GAGGCTGAGC 2100 
TTCGGGTCAC CGCCCCTCCA GAGGCTGAGT ACTCAGGACT CGTCAGACAC CCAGGGAACT 2160 
TCTCCAGCCT GAGCTGTGAC TACTTTGCCG TGAACCAGAG CCGCCTGCTG GTGTGTGACC 2220 
TGGGCAACCC CATGAAGGCA GGAGCCAGTC TGTGGGGTGG CCTTCGGTTT ACAGTCCCTC 2280 
ATCTCCGGGA CACTAAGAAA ACCATCCAGT TTGACTTCCA GATCCTCAGC AAGAATCTCA 2340 
ACAACTCGCA AAGCGACGTG GTTTCCTTTC GGCTCTCCGT GGAGGCTCAG GCCCAGGTCA 2400 
CCCTGAACGG TGTCTCCAAG CCTGAGGCAG TGCTATTCCC AGTAAGCGAC TGGCATCCCC 2460 
GAGACCAGCC TCAGAAGGAG GAGGACCTGG GACCTGCTGT CCACCATGTC TATGAGCTCA 2520 
TCAACCAAGG CCCCAGCTCC ATTAGCCAGG GTGTGCTGGA ACTCAGCTGT CCCCAGGCTC 2580 
TGGAAGGTCA GCAGCTCCTA TATGTGACCA GAGTTACGGG ACTCAACTGC ACCACCAATC 2640 
ACCCCATTAA CCCAAAGGGC CTGGAGTTGG ATCCCGAGGG TTCCCTGCAC CACCAGCAAA 2700 
AACGGGAAGC TCCAAGCCGC AGCTCTGCTT CCTCGGGACC TCAGATCCTG AAATGCCCGG 2760 
AGGCTGAGTG TTTCAGGCTG CGCTGTGAGC TCGGGCCCCT GCACCAACAA GAGAGCCAAA 2820 
GTCTGCAGTT GCATTTCCGA GTCTGGGCCA AGACTTTCTT GCAGCGGGAG CACCAGCCAT 2880 
TTAGCCTGCA GTGTGAGGCT GTGTACAAAG CCCTGAAGAT GCCCTACCGA ATCCTGCCTC 2940 
GGCAGCTGCC CCAAAAAGAG CGTCAGGTGG CCACAGCTGT GCAATGGACC AAGGCAGAAG 3000 
GCAGCTATGG CGTCCCACTG TGGATCATCA TCCTAGCCAT CCTGTTTGGC CTCCTGCTCC 3060 
TAGGTCTACT CATCTACATC CTCTACAAGC TTGGATTCTT CAAACGCTCC CTCCCATATG 3120 
GCACCGCCAT GGAAAAAGCT CAGCTCAAGC CTCCAGCCAC CTCTGATGCC TGAGTCCTCC 3180 
CAATTTCAGA CTCCCATTCC TGAAGAACCA GTCCCCCCAC CCTCATTCTA CTGAAAAGGA 3240 
GGGGTCTGGG TACTTCTTGA AGGTGCTGAC GGCCAGGGAG AAGCTCCTCT CCCCAGCCCA 3300 
GAGACATACT TGAAGGGCCA GAGCCAGGGG GGTGAGGAGC TGGGGATCCC TCCCCCCCAT 3360 
GCACTGTGAA GGACCCTTGT TTACACATAC CCTCTTCATG GATGGGGGAA CTCAGATCCA 3420 
GGGACAGAGG CCCAGCCTCC CTGAAGCCTT TGCATTTTGG AGAGTTTCCT GAAACAACTG 3480 
GAAAGATAAC TAGGAAATCC ATTCACAGTT CTTTGGGCCA GACATGCCAC AAGGACTTCC 3540 
TGTCCAGCTC CAACCTGCAA AGATCTGTCC TCAGCCTTGC CAGAGATCCA AAAGAAGCCC 3600 
CCAGTAAGAA CCTGGAACTT GGGGAGTTAA GACCTGGCAG CTCTGGACAG CCCCACCCTG 3660 
GTGGGCCAAC AAAGAACACT AACTATGCAT GGTGCCCCAG GACCAGCTCA GGACAGATGC 3720 
CACAAGGATA GATGCTGGCC CAGGGCCAGA GCCCAGCTCC AAGGGGAATC AGAACTCAAA 3780 
TGGGGCCAGA TCCAGCCTGG GGTCTGGAGT TGATCTGGAA CCCAGACTCA GACATTGGCA 3840 
CCAATCCAGG CAGATCCAGG ACTATATTTG GGCCTGCTCC AGACCTGATC CTGGAGGCCC 3900 
AGTTCACCCT GATTTAGGAG AAGCCAGGAA TTTCCCAGGA CCTGAAGGGG CCATGATGGC 3960 
AACAGATCTG GAACCTCAGC CTGGCCAGAC ACAGGCCCTC CCTGTTCCCC AGAGAAAGGG 4020 
GAGCCCACTG TCCTGGGCCT GCAGAATTTG GGTTCTGCCT GCCAGCTGCA CTGATGCTGC 4080 
CCCTCATCTC TCTGCCCAAC CCTTCCCTCA CCTTGGCACC AGACACCCAG GACTTATTTA 4140 
AACTCTGTTG CAAGTGCAAT AAATCTGACC CAGTGCCCCC ACTGACCAGA ACTAGAAAAA 4200 
AAAA 



Seq ID NO: 6 Protein sequence: 
Protein Accession #: NP_002196.1 



1          11         21         31         41         51 

|          |          |          |          |          | 

MGSRTPESPL HAVQLRWGPR RRPPLVPLLL LLVPPPPRVG GFNLDAEAPA VLSGPPGSFF 60 
GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120 
LESSLSSSEG EEPVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEPLS DPVGTCYLST 180 
DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240 
IAESYYPEYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300 
GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 
EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420 
QQGVVFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480 
VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSFCLN ASGKHVADSI 540 
GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600 
LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660 
GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 
FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VPHLRDTKKT IQFDFQILSK NLNNSQSDVV 780 
SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 
SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900 
SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960 
YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 
Seq ID NO: 7 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002211 
Coding sequence: 104..2500 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51 

GTCCGCCAAA ACCTGGGCGG ATAGGGAAGA ACAGCACCCC GGCGCCGATT GCOGTACCAA 60 
ACAAGCCTAA CGTCCGCTGG GCCCOGGACG CCGCGCGGAA AAGATGAATT TACAACCAAT 120 
TTTCTGGATT GGACTGATCA GTTCAGTTTG CTGTGTGTTT GCTCAAACAG ATGAAAATAG 180 
ATGTTTAAAA GCAAATGCCA AATCATGTGG AGAATGTATA CAAGCAGGGC CAAATTGTGG 240 
GTGGTGCACA AATTCAACAT TTTTACAGGA AGGAATGCCT ACTTCTGCAC GATGTGATGA 300 
TTTAGAAGCC TTAAAAAAGA AGGGTTGCCC TCCAGATGAC ATAGAAAATC CCAGAGGCTC 360 
CAAAGATATA AAGAAAAATA AAAATGTAAC CAACCGTAGC AAAGGAACAG CAGAGAAGCT 420 
CAAGCCAGAG GATATTACTC AGATCCAACC ACAGCAGTTG GTTTTGCGAT TAAGATCAGG 480 
GGAGCCACAG ACATTTACAT TAAAATTCAA GAGAGCTGAA GACTATCCCA TTGACCTCTA 540 
CTACCTTATG GACCTGTCTT ATTCAATGAA AGACGATTTG GAGAATGTAA AAAGTCTTGG 600 
AACAGATCTG ATGAATGAAA TGAGGAGGAT TACTTCGGAC TTCAGAATTG GATTTGGCTC 660 
ATTTGTGGAA AAGACTGTGA TGCCTTACAT TAGCACAACA CCAGCTAAGC TCAGGAACCC 720 
TTGCACAAGT GAACAGAACT GCACCACCCC ATTTAGCTAC AAAAATGTGC TCAGTCTTAC 780 
TAATAAAGGA GAAGTATTTA ATGAACTTGT TGGAAAACAG CGCATATCTG GAAATTTGGA 840 
TTCTCCAGAA GGTGGTTTCG ATGCCATCAT GCAAGTTGCA GTTTGTGGAT CACTGATTGG 900 
CTGGAGGAAT GTTACACGGC TGCTGGTGTT TTCCACAGAT GCCGGGTTTC ACTTTGCTGG 960 

AGATGGGAAA CTTGGTGGCA TTGTTTTACC AAATGATGGA CAATGTCACC TGGAAAATAA 1020 

TATGTACACA ATGAGCCATT ATTATGATTA TCCTTCTATT GCTCACCTTG TCCAGAAACT 1080 

GAGTGAAAAT AATATTCAGA CAATTTTTGC AGTTACTGAA GAATTTCAGC CTGTTTACAA 1140 

GGAGCTGAAA AACTTGATCC CTAAGTCAGC AGTAGGAACA TTATCTGCAA ATTCTAGCAA 1200 

TGTAATTCAG TTGATCATTG ATGCATACAA TTCCCTTTCC TCAGAAGTCA TTTTGGAAAA 1260 

CGGCAAATTG TCAGAAGGAG TAACAATAAG TTACAAATCT TACTGCAAGA ACGGGGTGAA 1320 

TGGAACAGGG GAAAATGGAA GAAAATGTTC CAATATTTCC ATTGGAGATG AGGTTCAATT 1380 

TGAAATTAGC ATAACTTCAA ATAAGTGTCC AAAAAAGGAT TCTGACAGCT TTAAAATTAG 1440 

GCCTCTGGGC TTTACGGAGG AAGTAGAGGT TATTCTTCAG TACATCTGTG AATGTGAATG 1500 

CCAAAGCGAA GGCATCCCTG AAAGTCCCAA GTGTCATGAA GGAAATGGGA CATTTGAGTG 1560 

TCGCGCGTGC AGGTGCAATG AAGGGCGTGT TGGTAGACAT TGTGAATGCA GCACAGATGA 1620 

AGTTAACAGT GAAGACATGG ATGCTTACTG CAGGAAAGAA AACAGTTCAG AAATCTGCAG 1680 

TAACAATGGA GAGTGCGTCT GCGGACAGTG TGTTTGTAGG AAGAGGGATA ATACAAATGA 1740 

AATTTATTCT qqqjj^^^ gCGAGTGTGA TAATTTCAAC TGTGATAGAT CCAATGGCTT 1800 

AATTTGTGGA GGAAATGGTG TTTGCAAGTG TCGTGTGTGT GAGTGCAACC CCAACTACAC 1860 

TGGCAGTGCA TGTGACTGTT CTTTGGATAC TAGTACTTGT GAAGCCAGCA ACGGACAGAT 1920 

CTGCAATGGC CGGGGCATCT GCGAGTGTGG TGTCTGTAAG TGTACAGATC CGAAGTTTCA 1980 

AGGGCAAACG TGTGAGATGT GTCAGACCTG CCTTGGTGTC TGTGCTGAGC ATAAAGAATG 2040 

TGTTCAGTGC AGAGCCTTCA ATAAAGGAGA AAAGAAAGAC ACATGCACAC AGGAATGTTC 2100 

CTATTTTAAC ATTACCAAGG TAGAAAGTCG GGACAAATTA CCCCAGCCGG TCCAACCTGA 2160 

TCCTGTGTCC CATTGTAAGG AGAAGGATGT TGACGACTGT TGGTTCTATT TTACGTATTC 2220 

AGTGAATGGG AACAACGAGG TCATGGTTCA TGTTGTGGAG AATCCAGAGT GTCCCACTGG 2280 

TCCAGACATC ATTCCAATTG TAGCTGGTGT GGTTGCTGGA ATTGTTCTTA TTGGCCTTGC 2340 

ATTACTGCTG ATATGGAAGC TTTTAATGAT AATTCATGAC AGAAGGGAGT TTGCTAAATT 2400 

TGAAAAGGAG AAAATGAATG CCAAATGGGA CACGGGTGAA AATCCTATTT ATAAGAGTGC 2460 

CGTAACAACT GTGGTCAATC CGAAGTATGA GGGAAAATGA GTACTGCCCG TGCAAATCCC 2520 

ACAACACTGA ATGCAAAGTA GCAATTTCCA TAGTCACAGT TAGGTAGCTT TAGGGCAATA 2580 

TTGCCATGGT TTTACTCATG TGCAGGTTTT GAAAATGTAC AATATGTATA ATTTTTAAAA 2640 

TGTTTTATTA TTTTGAAAAT AATGTTGTAA TTCATGCCAG GGACTGACAA AAGACTTGAG 2700 

ACAGGATGGT TATTCTTGTC AGCTAAGGTC ACATTGTGCC TTTTTGACCT TTTCTTCCTG 2760 

GACTATTGAA ATCAAGCTTA TTGGATTAAG TGATATTTCT ATAGCGATTG AAAGGGCAAT 2820 

AGTTAAAGTA ATGAGCATGA TGAGAGTTTC TGTTAATCAT GTATTAAAAC TGATTTTTAG 2880 

CTTTACATAT GTCAGTTTGC AGTTATGCAG AATCCAAAGT AAATGTCCTG CTAGCTAGTT 2940 

AAGGATTGTT TTAAATCTGT TATTTTGCTA TTTGCCTGTT AGACATGACT GATGACATAT 3000 

CTGAAAGACA AGTATGTTGA GAGTTGCTGG TGTAAAATAC GTTTGAAATA GTTGATCTAC 3060 

AAAGGCCATG GGAAAAATTC AGAGAGTTAG GAAGGAAAAA CCAATAGCTT TAAAACCTGT 3120 

GTGCCATTTT AAGAGTTACT TAATGTTTGG TAACTTTTAT GCCTTCACTT TACAAATTCA 3180 

AGCCTTAGAT AAAAGAACCG AGCAATTTTC TGCTAAAAAG TCCTTGATTT AGCACTATTT 3240 

ACATACAGGC CATACTTTAC AAAGTATTTG CTGAATGGGG ACCTTTTGAG TTGAATTTAT 3300 

TTTA1TATTT TTATTTTGTT TAATGTCTGG TGCTTTCTAT CACCTCTTCT AATCTTTTAA 3360 

TGTATTTGTT TGCAATTTTG GGGTAAGACT TTTTTATGAG TACTTTTTCT TTGAAGTTTT 3420 

AGCGGTCAAT TTGCCTTTTT AATGAACATG TGAAGTTATA CTGTGGCTAT GCAACAGCTC 3480 

TCACCTACGC GAGTCTTACT TTGAGTTAGT GCCATAACAG ACCACTGTAT GTTTACTTCT 3540 

CACCATTTGA GTTGCCCATC TTGTTTCACA CTAGTCACAT TCTTGTTTTA AGTGCCTTTA 3600 
GTTTTAACAG TTCA 

Seq ID NO: 8 Protein sequence: 
Protein Accession #: NP_002202 



1 11 21 31 41 51 



MNLQPIFWIG LISSVCCVFA QTDENRCLKA NAKSCGECIQ AGPNCGWCTN STFLQEGMPT 60 
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10 



SARCDDLEAL KKKGCPPDDI ENPRGSKDIK KNKNVTNRSK GTAEKLKPED ITQIQPQQLV 120
LRLRSGEPQT FTLKFKRAED YPIDLYYLMD LSYSMKDDLE NVKSLGTDLM NEMRRITSDF 180
RIGFGSFVEK TVMPYISTTP AKLRNPCTSE QNCTSPFSYK NVLSLTNKGE VFNELVGKQR 240
ISGNLDSPEG GFDAIMQVAV CGSLIGWRNV TRLLVFSTDA GFHFAGDGKL GGIVLPNDGQ 300
CHLENNMYTM SHYYDYPSIA HLVQKLSENN IQTIFAVTEE FQPVYKELKN LIPKSAVGTL 360
SANSSNVIQL IIDAYNSLSS EVILENGKLS EGVTISYKSY CKNGVNGTGE NGRKCSNISI 420
GDEVQFEISI TSNKCPKKDS DSFKIRPLGF TEEVEVILQY ICECECQSEG IPESPKCHEG 480
NGTFECGACR CNEGRVGRHC ECSTDEVNSE DMDAYCRKEN SSEICSNNGE CVCGQCVCRK 540
RDNTNEIYSG KPCECDNFNC DRSNGLICGG NGVCKCRVCE CNPNYTGSAC DCSLDTSTCE 600
ASNGQICNGR GICECGVCKC TDPKFQGQTC EMCQTCLGVC AEHKECVQCR AFNKGEKKDT 660
CTQECSYFNI TKVESRDKLP QPVQPDPVSH CKEKDVDDCW FYFTYSVNGN NEVMVHVVEN 720
PECPTGPDII PIVAGVVAGI VLIGLALLLI WKLLMIIHDR REFAKFEKEK MNAKWDTGEN 780
PIYKSAVTTV VNPKYEGK



Seq ID NO: 9 Nucleotide sequence:
Nucleic Acid Accession #: NM_002425

Coding sequence: 23.. 1453 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

AAAGAAGGTA AGGGCAGTGA GAATGATGCA TCTTGCATTC CTTGTGCTGT TGTGTCTGCC 60
AGTCTGCTCT GCCTATCCTC TGAGTGGGGC AGCAAAAGAG GAGGACTCCA ACAAGGATCT 120
TGCCCAGCAA TACCTAGAAA AGTACTACAA CCTCGAAAAG GATGTGAAAC AGTTTAGAAG 180
AAAGGACAGT AATCTCATTG TTAAAAAAAT CCAAGGAATG CAGAAGTTCC TTGGGTTGGA 240
GGTGACAGGG AAGCTAGACA CTGACACTCT GGAGGTGATG CGCAAGCCCA GGTGTGGAGT 300
TCCTGACGTT GGTCACTTCA GCTCCTTTCC TGGCATGCCG AAGTGGAGGA AAACCCACCT 360
TACATACAGG ATTGTGAATT ATACACCAGA TTTGCCAAGA GATGCTGTTG ATTCTGCCAT 420
TGAGAAAGCT CTGAAAGTCT GGGAAGAGGT GACTCCACTC ACATTCTCCA GGCTGTATGA 480
AGGAGAGGCT GATATAATGA TCTCTTTOGC AGTTAAAGAA CATGGAGACT TTTACTCTTT 540
TGATGGCCCA GGACACAGTT TGGCTCATGC CTACCCACCT GGACCTGGGC TTTATGGAGA 600
TATTCACTTT GATGATGATG AAAAATGGAC AGAAGATGCA TCAGGGACCA ATTTATTCCT 660
CGTTGCTGCT CATGAACTTG GCCACTCCCT GGGGCTCTTT CACTCAGCCA ACACTGAAGC 720
TTTGATGTAC CCACTCTACA ACTCATTCAC AGAGCTCGCC CAGTTCCGCC TTTCGCAAGA 780
TGATGTGAAT GGCATTCAGT CTCTCTACGG ACCTCCCCCT GCCTCTACTG AGGAACCCCT 840
GGTGCCCACA AAATCTGTTC CTTCGGGATC TGAGATGCCA GCCAAGTGTG ATCCTGCTTT 900
GTCCTTCGAT GCCATCAGCA CTCTGAGGGG AGAATATCTG TTCTTTAAAG ACAGATATTT 960
TTGGCGAAGA TCCCACTGGA ACCCTGAACC TGAATTTCAT TTGATTTCTG CATTTTGGCC 1020
CTCTCTTCCA TCATATTTGG ATGCTGCATA TGAAGTTAAC AGCAGGGACA CCGTTTTTAT 1080
TTTTAAAGGA AATGAGTTCT GGGCCATCAG AGGAAATGAG GTACAAGCAG GTTATCCAAG 1140
AGGCATCCAT ACCCTGGGTT TTCCTCCAAC CATAAGGAAA ATTGATGCAG CTGTTTCTGA 1200
CAAGGAAAAG AAGAAAACAT ACTTCTTTGC AGCGGACAAA TACTGGAGAT TTGATGAAAA 1260
TAGCCAGTCC ATGGAGCAAG GCTTCCCTAG ACTAATAGCT GATGACTTTC CAGGAGTTGA 1320
GCCTAAGGTT GATGCTGTAT TACAGGCATT TGGATTTTTC TACTTCTTCA GTGGATCATC 1380
ACAGTTTGAG TTTGACCCCA ATGCCAGGAT GGTGACACAC ATATTAAAGA GTAACAGCTG 1440
GTTACATTGC TAGGCGAGAT AGGGGGAAGA CAGATATGGG TGTTTTTAAT AAATCTAATA 1500
ATTATTCATC TAATGTATTA TGAGCCAAAA TGGTTAATTT TTCCTGCATG TTCTGTGACT 1560
GAAGAAGATG AGCCTTGCAG ATATCTGCAT GTGTCATGAA GAATGTTTCT GGAATTCTTC 1620
ACTTGCTTTT GAATTGCACT GAACAGAATT AAGAAATACT CATGTGCAAT AGGTGAGAGA 1680
ATGTATTTTC ATAGATGTGT TATTACTTCC TCAATAAAAA GTTTTATTTT GGGCCTGTTC 1740
CTT



55 



Seq ID NO: 10 Protein sequence: 
Protein Accession #: NP_002416



1 11 21 31 41 51

MMHLAFLVLL CLPVCSAYPL SGAAKEEDSN KDIAQQYLEK YYNLEKDVKQ FRRKDSNLIV 60
KKIQGMQKFL GLEVTGKLDT DTLEVMRKPR CGVPDVGHFS SFPGMPKWRK THLTYRIVNY 120
TPDLPRDAVD SAIEKALKVW EEVTPLTFSR LYEGEADIMI SFAVKEHGDF YSFDGPGHSL 180
AHAYPPGPGL YGDIHFDDDE KWTEDASGTN LFLVAAHELG HSDGLFHSAN TEALMYPLYN 240
SFTELAQFRL SQDDVNGIQS LYGPPPASTE EPLVPTKSVP SGSEMPAKCD PALSFDAIST 300
LRGEYLFFKD RYFWRRSHWN PEPEFHLISA FWPSLPSYLD AAYEVNSRDT VFIFKGNEFW 360
AIRGNEVQAG YPRGIHTLGF PPTIRKIDAA VSDKEKKKTY FFAADKYWRF DENSQSMEQG 420
FPRLIADDFP GVEPKVDAVL QAFGFFYFFS GSSQFEFDPN ARMVTHILKS NSWLHC
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Seq ID NO: 11 Nucleotide sequence: 
Nucleic Acid Accession #: XM_058189

Coding sequence: 169..774 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

GAAGACCAGC TCAGCTCTTC AGTTGTTGAT CATTGTCTAT TGTTCTCCAA ACAGTAAACC 60
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AGTATTTCAC ACTGAGATTG TCGGCTGCGG GTATATTCCA ATTCCCCGTC TCCTCATGAA 120 

TATGAAGTGA AGGGCTCTGA CCCTGGAAGT GGTTCTAAGC AGGGCAAAAT GGGGTCTCGG 180 

AAGTGTGGAG GCTGCCTAAG TTGTTTGCTG ATTCCGCTTG CACTTTGGAG TATAATCGTG 240 

AACATATTAT TGTATTTCCC GAATGGGCAA ACTTCCTATG CATCCAGCAA TAAACTCACC 300 

AACTACGTGT GGTATTTTGA AGGAATCTGT TTCTCAGGCA TCATGATGCT TATAGTAACA 360 

ACAGTTCTTC TGGTACTGGA GAATAATAAC AACTATAAAT GTTGCCAGAG TGAAAACTGC 420 

AGCAAAAAAT ATGTGACACT GCTGTCAATT ATCTTTTCTT CCCTCGGAAT TGCTTTTTCT 480 

GGATACTGCC TGGTCATCTC TGCCTTGGGT CTTGTCCAAG GGCCATATTG CCGCACCCTT 540 

GATGGCTGGG AGTATGCTTT TGAAGGCACT GCTGGACGTT TCCTTACAGA TTCTAGCATA 600 

TGGATTCAGT GCCTGGAACC TGCACATGTT GTGGAGTGGA ACATCATTTT ATTTTCCATT 660 

CTCATAACCC TCAGTGGGCT TCAAGTGATC ATCTGCCTCA TCAGAGTAGT CATGCAACTA 720 

TCCAAGATAC TGTGTGGAAG CTATTCAGTG ATCTTCCAGC CTGGAATCAT TTGAATAAGG 780 

ACAAAATGTT TTCCATTATC AAGACATGGC CATCTATCTA AATATTATAT CAACTGTGTA 840 

GACTTGAGGG CAATATTGAA ATGATGGTGC TTTCTGCATT TGGTGTTTAT TTGTAAAAAA 900 

TTTGCAGTCC TCACTGCACA TGCAAGTATA CGACCCTTCC ATTTAGTATG TTTTTTAAGT 960 

AATATGCATC AGAAACTTCA GAAATACTTC TGCCCTTTGA TCAAACAAAT CCATTTCCAA 1020 

GAATCTGTAC TAGGGAAGTA AATAAGAATA TGAGAGAAAC CTTTATGCAA ATATGTATAT 1080 

TGCAACATTA TTTAATATTC TGGAAAATTG GAAACACCCC AAAATTCTAA ACTCAGAGGA 1140 

AGGATTAAGT AAAGAGTGGT ACATACTGTA AATGTTTTCT GATATTAAAA AAAAAATTAA 1200 
ATAAAAAATA AAGAGTACTA CATGGTTGTA AAA 

Seq ID NO: 12 Protein sequence:
Protein Accession #: XP_058189



1 11 21 31 41 51 

I I I I I I

MGSRKCGGCL SCLLIPLALW SIIVNILLYF PNGQTSYASS NKLTNYVWYF EGICFSGIMM 60
LIVTTVLLVL ENNNNYKCCQ SENCSKKYVT LLSIIFSSLG IAFSGYCLVI SALGLVQGPY 120
CRTLDGWEYA FEGTAGRFLT DSSIWIQCLE PAHVVEWNII LFSILITLSG LQVIICLIRV 180
VMQLSKILCG SYSVIFQPGI I



Seq ID NO: 13 Nucleotide sequence:
Nucleic Acid Accession #: NM_005397 

Coding sequence: 251.. 1837 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

AAACGCCGCC CAGGACGCAG CCGCCGCCGC CGCCGCTCCT CTGCCACTGG CTCTGCGCCC 60
CAGCCCGGCT CTGCTGCAGC GGCAGGGAGG AAGAGCCGCC GCAGCGCGAC TCGGGAGCCC 120
CGGGCCACAG CCTGGCCTCC GGAGCCACCC ACAGGCCTCC CCGGGCGGCG CCCACGCTCC 180
TACCGCCCGG ACGCGCGGAT CCTCCGCCGG CACCGCAGCC ACCTGCTCCC GGCCCAGAGG 240
CGACGACACG ATGCGCTGCG CGCTGGCGCT CTCGGCGCTG CTGCTACTGT TGTCAACGCC 300
GCCGCTGCTG CCGTCGTCGC CGTCGCCGTC GCCGTCGCCG TCGCCCTCCC AGAATGCAAC 360
CCAGACTACT ACGGACTCAT CTAACAAAAC AGCACCGACT CCAGCATCCA GTGTCACCAT 420
CATGGCTACA GATACAGCCC AGCAGAGCAC AGTCCCCACT TCCAAGGCCA ACGAAATCTT 480
GGCCTCGGTC AAGGCGACCA CCCTTGGTGT ATCCAGTGAC TCACCGGGGA CTACAACCCT 540
GGCTCAGCAA GTCTCAGGCC CAGTCAACAC TACCGTGGCT AGAGGAGGCG GCTCAGGCAA 600
CCCTACTACC ACCATCGAGA GCCCCAAGAG CACAAAAAGT GCAGACACCA CTACAGTTGC 660
AACCTCCACA GCCACAGCTA AACCTAACAC CACAAGCAGC CAGAATGGAG CAGAAGATAC 720
AACAAACTCT GGGGGGAAAA GCAGCCACAG TGTGACCACA GACCTCACAT CCACTAAGGC 780
AGAACATCTG ACGACCCCTC ACCCTACAAG TCCACTTAGC CCCCGACAAC CCACTTTGAC 840
GCATCCTGTG GCCACCCCAA CAAGCTCGGG ACATGACCAT CTTATGAAAA TTTCAAGCAG 900
TTCAAGCACT GTGGCTATCC CTGGCTACAC CTTCACAAGC CCGGGGATGA CCACCACCCT 960
ACCGTCATCG GTTATCTCGC AAAGAACTCA ACAGACCTCC AGTCAGATGC CAGCCAGCTC 1020
TACGGCCCCT TCCTCCCAGG AGACAGTGCA GCCCACGAGC CCGGCAACGG CATTGAGAAC 1080
ACCTACCCTG CCAGAGACCA TGAGCTCCAG CCCCACAGCA GCATCAACTA CCCACCGATA 1140
CCCCAAAACA CCTTCTCCCA CTGTGGCTCA TGAGAGTAAC TGGGCAAAGT GTGAGGATCT 1200
TGAGACACAG ACACAGAGTG AGAAGCAGCT CGTCCTGAAC CTCACAGGAA ACACCCTCTG 1260
TGCAGGGGGC GCTTCGGATG AGAAATTGAT CTCACTGATA TGCCGAGCAG TCAAAGCCAC 1320
CTTCAACCCG GCCCAAGATA AGTGCGGCAT ACGGCTGGCA TCTGTTCCAG GAAGTCAGAC 1380
CGTGGTCGTC AAAGAAATCA CTATTCACAC TAAGCTCCCT GCCAAGGATG TGTACGAGCG 1440
GCTGAAGGAC AAATGGGATG AACTAAAGGA GGCAGGGGTC AGTGACATGA AGCTAGGGGA 1500
CCAGGGGCCA CCGGAGGAGG CCGAGGACOG CTTCAGCATG CCCCTCATCA TCACCATCGT 1560
CTGCATGGCG TCATTCCTGC TCCTCGTGGC GGCCCTCTAT GGCTGCTGCC ACCAGCGCCT 1620
CTCCCAGAGG AAGGACCAGC AGCGGCTAAC AGAGGAGCTG CAGACAGTGG AGAATGGTTA 1680
CCATGACAAC CCAACACTGG AAGTGATGGA GACCTCTTCT GAGATGCAGG AGAAGAAGGT 1740
GGTCAGCCTC AACGGGGAGC TGGGGGACAG CTGGATOGTC CCTCTGGACA ACCTGACCAA 1800
GGACGACCTG GATGAGGAGG AAGACACACA CCTCTAGTCC GGTCTGCCGG TGGCCTCCAG 1860
CAGCACCACA GAGCTCCAGA CCAACCACCC CAAGTGCCGT TTGGATGGGG AAGGGAAAGA 1920
CTGGGGAGGG AGAGTGAACT CCGAGGGGTG TCCCCTCCCA ATCCCCCCAG GGCCTTAATT 1980
TTTCCCTTTT CAACCTGAAC AAATCACATT CTGTCCAGAT TCCTCTTGTA AAATAACCCA 2040
CTAGTGCCTG AGCTCAGTGC TGCTGGATGA TGAGGGAGAT CAAGAAAAAG CCACGTAAGG 2100
GACTTTATAG ATGAACTAGT GGAATCCCTT CATTCTGCAG TGAGATTGCC GAGACCTGAA 2160
GAGGGTAAGT GACTTGCCCA AGGTCAGAGC CACTTGGTGA CAGAGCCAGG ATGAGAACAA 2220



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AGATTCCATT TGCACCATGC CACACTGCTG TGTTCACATG TGCCTTCCGT CCAGAGCAGT 2280 

CCCGGGCAGG GGTGAAACTC CAGCAGGTGG CTGGGCTGGA AAGGAGGGCA GGGCTACATC 2340 

CTGGCTCGGT GGGATCTGAC GAOCTGAAAG TCCAGCTCCC AAGTTTTCCT TCTCCTACCC 2400 

CAGCCTCGTG TACCCATCTT CCCACCCTCT ATGTTCTTAC CCCTCCCTAC ACTCAGTGTT 2460 

TGTTCCCACT TACTCTGTCC TGGGGCCTCT GGGATTAGCA CAGGTTATTC ATAACCTTGA 2520 

ACCCCTTGTT CTGGATTCGG ATTTTCTCAC ATTTGCTTCG TGAGATGGGG GCTTAACCCA 2580 

CACAGGTCTC CGTGCGTGAA CCAGGTCTGC TTAGGGGACC TGCGTGCAGG TGAGGAGAGA 2640 

AGGGGACACT OGAGTCCAGG CTGGTATCTC AGGGCAGCTG ATGAGGGGTC AGCAGGAACA 2700 

CTGGCCCATT GCCCCTGGCA CTCCTTGCAG AGGCCACCCA CGATCTTCTT TGGGCTTCCA 2760 

TTTCCACCAG GGACTAAAAT CTGCTGTAGC TAGTGAGAGC AGCGTGTTCC TTTTGTTGTT 2820 

CACTGCTCAG CTGATGGGAG TGATTCCCTG AGACCCAGTA TGAAAGAGCA GTGGCTGCAG 2880 

GAGAGGCCTT CCCGGGGCCC CCCATCAGCG ATGTGTCTTC AGAGACAATC CATTAAAGCA 2940 

GCCAGGAAGG ACAGGCTTTC CCCTGTATAT CATAGGAAAC TCAGGGACAT TTCAAGTTGC 3000 

TGAGAGTTTT GTTATAGTTG TTTTCTAACC CAGCCCTCCA CTGCCAAAGG CCAAAAGCTC 3060 

AGACAGTTGG CAGACGTCCA GTTAGCTCAT CTCACTCACT CTGATTCTCC TGTGCCACAG 3120 

GAAAAGAGGG CCTGGAAAGC GCAGTGCATG CTGGGTGCAT GAAGGGCAGC CTGGGGGACA 3180 

GACTGTTGTG GGAACGTCCC ACTGTCCTGG CCTGGAGCTA GGCCTTGCTG TTCCTCTTCT 3240 

CTGTGAGCCT AGTGGGGCTG CTGCGGTTCT CTTGCAGTTT CTGGTGGCAT CTCAGGGGAA 3300 

CACAAAAGCT ATGTCTATTC CCCAATATAG GACTTTTATG GGCTCGGCAG TTAGCTGCCA 3360 

TGTAGAAGGC TCCTAAGCAG TGGGCATGGT GAGGTTTCAT CTGATTGAGA AGGGGGAATC 3420 

CTGTGTGGAA TGTTGAACTT TCGCCATGGT CTCCATCGTT CTGGGCGTAA ATTCCCTGGG 3480 

ATCAAGTAGG AAAATGGGCA GAACTGCTTA GGGGAATGAA ATTGCCATTT TTCGGGTGAA 3540 

ACGCCACACC TCCAGGGTCT TAAGAGTCAG GCTCCGGCTG TAGTAGCTCT GATGAAATAG 3600 

GCTATCCACT CGGGATGGCT TACTTTTTAA AAGGGTAGGG GGAGGGGCTG GGGAAGATCT 3660 

GTCCTGCACC ATCTGCCTAA TTCCTTCCTC ACAGTCTGTA GCCATCTGAT ATCCTAGGGG 3720 

GAAAAGGAAG GCCAGGGGTT CACATAGGGC CCCAGCGAGT TTCCCAGGAG TTAGAGGGAT 3780 

GCGAGGCTAA CAAGTTCCAA AAACATCTGC CCCGATGCTC TAGTGTTTGG AGGTGGGCAG 3840 

GATGGAGAAC AGTGCCTGTT TGGGGGAAAA CAGGAAATCT TGTTAGGCTT GAGTGAGGTG 3900 

TTTGCTTCCT TCTTGCCCAG CGCTGGGTTC TCTCCACCCA GTAGGTTTTC TGTTGTGGTC 3960 

CCGTGGGAGA GGCCAGACTG GATTATTCCT CCTTTGCTGA TCCTGGGTCA CACTTCACCA 4020 

GCCAGGGCTT TTGACGGAGA CAGCAAATAG GCCTCTGCAA ATCAATCAAA GGCTGCAACC 4080 

CTATGGCCTC TTGGAGACAG ATGATGACTG GCAAGGACTA GAGAGCAGGA GTGCCTGGCC 4140 

AGGTCGGTCC TGACTCTCCT GACTCTCCAT CGCTCTGTCC AAGGAGAACC CGGAGAGGCT 4200 

CTGGGCTGAT TCAGAGGTTA CTGCTTTATA TTCGTCCAAA CTGTGTTAGT CTAGGCTTAG 4260 

GACAGCTTCA GAATCTGACA CCTTGCCTTG CTCTTGCCAC CAGGACACCT ATGTCAACAG 4320 

GCCAAACAGC CATGCATCTA TAAAGGTCAT CATCTTCTGC CACCTTTACT GGGTTCTAAA 4380 

TGCTCTCTGA TAATTCAGAG AGCATTGGGT CTGGGAAGAG GTAAGAGGAA CACTAGAAGC 4440 

TCAGCATGAC TTAAACAGGT TGTAGCAAAG ACAGTTTATC ATCAACTCTT TCAGTGGTAA 4500 

ACTGTGGTTT CCCCAAGCTG CACAGGAGGC CAGAAACCAC AAGTATGATG ACTAGGAAGC 4560 

CTACTGTCAT GAGAGTGGGG AGACAGGCAG CAAAGCTTAT GAAGGAGGTA CAGAATATTC 4620 

TTTGCGTTGT AAGACAGAAT ACGGGTTTAA TCTAGTCTAG GCRCCAGATT TTTTTCCCGC 4680 

TTGATAAGGA AAGCTAGCAG AAAGTTTATT TAAACCACTT CTTGAGCTTT ATCTTTTTTG 4740 

ACAATATACT GGAGAAACTT TGAAGAACAA GTTCAAACTG ATACATATAC ACATATTTTT 4800 

TTGATAATGT AAATACAGTG ACCATGTTAA CCTACCCTGC ACTGCTTTAA GTGAACATAC 4860 

TTTGAAAAAG CATTATGTTA GCTGAGTGAT GGCCAAGTTT TTTCTCTGGA CAGGAATGTA 4920 

AATGTCTTAC TGGAAATGAC AAGTTTTTGC TTGATTTTTT TTTTTAAACA AAAAATGAAA 4980 

TATAACAAGA CAAACTTATG ATAAAGTATT TGTCTTGTAG ATCAGGTGTT TTGTTTTGTT 5040 

TTTTTAATTT TAAAATGCAA CCCTGCCCCC TCCCCAGCAA AGTCACAGCT CCATTTCAGT 5100 

AAAGGTTGGA GTCAATATGC TCTGGTTGGC AGGCAACCCT GTAGTCATGG AGAAAGGTAT 5160 

TTCAAGATCT AGTCCAATCT TTTTCTAGAG AAAAAGATAA TCTGAAGCTC ACAAAGATGA 5220 

AGTGACTTCC TCAAAATCAC ATGGTTCAGG ACAGAAACAA GATTAAAACC TGGATCCACA 5280 

GACTGTGCGC CTCAGAAGGA ATAATCGGTA AATTAAGAAT TGCTACTCGA AGGTGCCAGA 5340 

ATGACACAAA GGACAGAATT CCTTTCCCAG TTGTTACCCT AGCAAGGCTA GGGAGGGCAT 5400 

GAACACAAAC ATAAGAACTG GTCTTCTCAC ACTTTCTCTG AATCATTTAG GTTTAAGATG 5460 

TAAGTGAACA ATTCTTTCTT TCTGCCAAGA AACAAAGTTT TGGATGAGCT TTTATATATG 5520 

GAACTTACTC CAACAGGACT GAGGGACCAA GGAAACATGA TGGGGGAGGC AAGAGAGGGC 5580 

AAAGAGTAAA ACTGTAGCAT AGCTTTTGTC ACGGTCACTA GCTGATCCCT CAGGTCTGCT 5640 

GCAAACACAG CATGGAGGAC ACAGATGACT CTTTGGTGTT GGTCTTTTTG TCTGCAGTGA 5700 

ATGTTCAACA GTTTGCCCAG GAACTGGGGG ATCATATATG TCTTAGTGGA CAGGGGTCTG 5760 

AAGTACACTG GAATTTACTG AGAAACTTGT TTGTAAAAAC TATAGTTAAT AATTATTGCA 5820 
TTTTCTTACA AAAATATATT TTGGAAAATT GTATACTGTC AATTAAAGT 

Seq ID NO: 14 Protein sequence:
Protein Accession #: NP_005388 
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MRCALAIiSAL LLLLSTPPLL PSSPSPSPSP SPSQNATQTT TDSSNKTAPT PASSVTIMAT 60 

DTAQQSTVPT SKANEILASV KATTLGVSSD SPGTTTLAQQ VSGPVNTTVA RGGGSGNPTT 120
TIESPKSTKS ADTTTVATST ATAKPNTTSS QNGAKDTTNS GGKSSHSVTT DLTSTKAEHL 180
TTPHPTSPLS PRQPTLTHPV ATPTSSGHDH LMKISSSSST VAIPGYTFTS PGMTTTLPSS 240
VISQRTQQTS SQMPASSTAP SSQETVQPTS PATALRTPTL PETMSSSPTA ASTTHRYPKT 300
PSPTVAHESN WAKCEDLETQ TQSEKQLVLN LTGNTLCAGG ASDEKLISLI CRAVKATPNP 360
AQDKCGIRLA SVPGSQTVVV KEITIHTKLP AKDVYERLKD KWDELKEAGV SDMKLGDQGP 420
PEEAEDRPSM PLIITIVCMA SPLLLVAALY GCCHQRLSQR KDQQRLTEEL QTVENGYHDN 480
PTLEVMETSS EMQEKKVVSL NGELGDSWTV PLDNLTKDDL DEEEDTHL
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Seq ID NO: 15 Nucleotide sequence: 
Nucleic Acid Accession #: NM_004105 
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CTAGTATTCT ACTAGAACTG GAAGATTGCT CTCCGAGTTT TTTTTTTGTT ATTTTGTTAA 60 

AAAATAAAAA GCTTGAGCAG CAATTCATAT TACTGTCACA GGTATTTTTG CTGTGCTGTG 120 

CAAGGTAACT CTGCTAGCTA AGATTCACAA TGTTGAAAGC CCTTTTCCTA ACTATGCTGA 180 

CTCTGGCGCT GGTCAAGTCA CAGGACACCG AAGAAACCAT CACGTACACG CAATGCACTG 240 

ACGGATATGA GTGGGATCCT GTGAGACAGC AATGCAAAGA TATTGATGAA TGTGACATTG 300 

TCCCAGACGC TTGTAAAGGT GGAATGAAGT GTGTCAACCA CTATGGAGGA TACCTCTGCC 360 

TTCCGAAAAC AGCCCAGATT ATTGTCAATA ATGAACAGCC TCAGCAGGAA ACACAACCAG 420 

CAGAAGGAAC CTCAGGGGCA ACCACCGGGG TTGTAGCTGC CAGCAGCATG GCAACCAGTG 480 

GAGTGTTGCC CGGGGGTGGT TTTGTGGCCA GTGCTGCTGC AGTCGCAGGC CCTGAAATGC 540 

AGACTGGCCG AAATAACTTT GTCATCOGGC GGAACCCAGC TGACCCTCAG CGCATTCCCT 600 

CCAACCCTTC CCACCGTATC CAGTGTGCAG CAGGCTACGA GCAAAGTGAA CACAACGTGT 660 

GCCAAGACAT AGACGAGTGC ACTGCAGGGA CGCACAACTG TAGAGCAGAC CAAGTGTGCA 720 

TCAATTTACG GGGATCCTTT GCATGTCAGT GCCCTCCTGG ATATCAGAAG OGAGGGGAGC 780 

AGTGCGTAGA CATAGATGAA TGTACCATCC CTCCATATTG CCACCAAAGA TGCGTGAATA 840 

CACCAGGCTC ATTTTATTGC CAGTGCAGTC CTGGGTTTCA ATTGGCAGCA AACAACTATA 900 

CCTGCGTAGA TATAAATGAA TGTGATGCCA GCAATCAATG TGCTCAGCAG TGCTACAACA 960 

TTCTTGGTTC ATTCATCTGT CAGTGCAATC AAGGATATGA GCTAAGCAGT GACAGGCTCA 1020 

ACTGTGAAGA CATTGATGAA TGCAGAACCT CAAGCTACCT GTGTCAATAT CAATGTGTCA 1080 

ATGAACCTGG GAAATTCTCA TGTATGTGCC CCCAGGGATA CCAAGTGGTG AGAAGTAGAA 1140 

CATGTCAAGA TATAAATGAG TGTGAGACCA CAAATGAATG CCGGGAGGAT GAAATGTGTT 1200 

GGAATTATCA TGGCGGCTTC CGTTGTTATC CACGAAATCC TTGTCAAGAT CCCTACATTC 1260 

TAACACCAGA GAACCGATGT GTTTGCCCAG TCTCAAATGC CATGTGCCGA GAACTGCCCC 1320 

AGTCAATAGT CTACAAATAC ATGAGCATCC GATCTGATAG GTCTGTGCCA TCAGACATCT 1380 

TCCAGATACA GGCCACAACT ATTTATGCCA ACACCATCAA TACTTTTCGG ATTAAATCTG 1440 

GAAATGAAAA TGGAGAGTTC TACCTACGAC AAACAAGTCC TGTAAGTGCA ATGCTTGTGC 1500 

TCGTGAAGTC ATTATCAGGA CCAAGAGAAC ATATCGTGGA CCTGGAGATG CTGACAGTCA 1560 

GCAGTATAGG GACCTTCCGC ACAAGCTCTG TGTTAAGATT GACAATAATA GTGGGGCCAT 1620 

TTTCATTTTA GTCTTTTCTA AGAGTCAACC ACAGGCATTT AAGTCAGCCA AAGAATATTG 1680 

TTACCTTAAA GCACTATTTT ATTTATAGAT ATATCTAGTG CATCTACATC TCTATACTGT 1740 

ACACTCACCC ATAACAAACA ATTACACCAT GGTATAAAGT GGGCATTTAA TATGTAAAGA 1800 

TTCAAAGTTT GTCTTTATTA CTATATGTAA ATTAGACATT AATCCACTAA ACTGGTCTTC 1860 

TTCAAGAGAG CTAAGTATAC ACTATCTGGT GAAACTTGGA TTCTTTCCTA TAAAAGTGGG 1920 

ACCAAGCAAT GATGATCTTC TGTGGTGCTT AAGGAAACTT ACTAGAGCTC CACTAACAGT 1980 

CTCATAAGGA GGCAGCCATC ATAACCATTG AATAGCATGC AAGGGTAAGA ATGAGTTTTT 2040 

AACTGCTTTG TAAGAAAATG GAAAAGGTCA ATAAAGATAT ATTTCTTTAG AAAATGGGGA 2100 

TCTGCCATAT TTGTGTTGGT TTTTATTTTC ATATCCAGCC TAAAGGTGGT TGTTTATTAT 2160 

ATAGTAATAA ATCATTGCTG TACAACATGC TGGTTTCTGT AGGGTATTTT TAATTTTGTC 2220 

AGAAATTTTA GATTGTGAAT ATTTTGTAAA AAACAGTAAG CAAAATTTTC CAGAATTCCC 2280 

AAAATGAACC AGATACCCCC TAGAAAATTA TACTATTGAG AAATCTATGG GGAGGATATG 2340 

AGAAAATAAA TTCCTTCTAA ACCACATTGG AACTGACCTG AAGAAGCAAA CTCGGAAAAT 2400 

ATAATAACAT CCCTGAATTC AGGCATTCAC AAGATGCAGA ACAAAATGGA TAAAAGGTAT 2460 

TTCACTGGAG AAGTTTTAAT TTCTAAGTAA AATTTAAATC CTAACACTTC ACTAATTTAT 2520 

AACTAAAATT TCTCATCTTC GTACTTGATG CTCACAGAGG AAGAAAATGA TGATGGTTTT 2580 

TATTCCTGGC ATCCAGAGTG ACAGTGAACT TAAGCAAATT ACCCTCCTAC CCAATTCTAT 2640 

GGAATATTTT ATACGTCTCC TTGTTTAAAA TCTGACTGCT TTACTTTGAT GTATCATATT 2700 
TTTAAATAAA AATAAATATT CCTTTAGAAG ATCACTCTAA AA 
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MLKALFLTML TLALVKSQDT EETITYTQCT DGYEWDPVRQ QCKDIDECDI VPDACKGGMK 60 

CVNHYGGYLC LPKTAQIIVN NEQPQQETQP AEGTSGATTG VVAASSMATS GVLPGGGFVA 120
SAAAVAGPEM QTGRNNFVIR RNPADPQRIP SNPSHRIQCA AGYEQSEHNV CQDIDECTAG 180
THNCRADQVC INLRGSFACQ CPPGYQKRGE QCVDIDECTI PPYCHQRCVN TPGSFYCQCS 240
PGFQLAANNY TCVDINECDA SNQCAQQCYN ILGSFICQCN QGYELSSDRL NCEDIDECRT 300
SSYLCQYQCV NEPGKFSCMC PQGYQVVRSR TCQDINECET TNECREDEMC WNYHGGFRCY 360
PRNPCQDPYI LTPENRCVCP VSNAMCRELP QSIVYKYMSI RSDRSVPSDI FQIQATTIYA 420
NTINTFRIKS GNENGEFYLR QTSPVSAMLV LVKSLSGPRE HIVDLEMLTV SSIGTFRTSS 480
VLRLTIIVGP FSF



Seq ID NO: 17 Nucleotide sequence: 
Nucleic Acid Accession #: NM_018894 
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10 



1 11 21 31 41 51

AAAACATTCA ACAAATTAAT GGGTGTAAGG AACTGGAAAA CCTGGACTCC TACCACATGC 60
AGATAAAACC AATAGAGTGC AGAATAAGAC TCAAGTCAAG TAAGTAACGT TAAACACCAT 120
AAAGACACAT GGCCTTCTTT GTGTACATGA CATGCATTCT CAACAATGCA CTGACGGATA 180
TGAGTGGGAT CCTGTGAGAC AGCAATGCAA AGATATTGAT GAATGTGACA TTGTCCCAGA 240
CGCTTGTAAA GGTGGAATGA AGTGTGTCAA CCACTATGGA GGATACCTCT GCCTTCCGAA 300
AACAGCCCAG ATTATTGTCA ATAATGAACA GCCTCAGCAG GAAACACAAC CAGCAGAAGG 360
AACCTCAGGG GCAACCACCG GGGTTGTAGC TGCCAGCAGC ATGGCAACCA GTGGAGTGTT 420
GCCCGGGGGT GGTTTTGTGG CCAGTGCTGC TGCAGTCGCA GGCCCTGAAA TGCAGACTGG 480
CCGAAATAAC TTTGTCATCC GGCGGAACCC AGCTGACCCT CAGCGCATTC CCTCCAACCC 540
TTCCCACCGT ATCCAGTGTG CAGCAGGCTA CGAGCAAAGT GAACACAACG TGTGCCAAGA 600
CATAGACGAG TGCACTGCAG GGACGCACAA CTGTAGAGCA GACCAAGTGT GCATCAATTT 660
ACGGGGATCC TTTGCATGTC AGTGCCCTCC TGGATATCAG AAGCGAGGGG AGCAGTGCGT 720
AGACATAGAT GAATGTACCA TCCCTCCATA TTGCCACCAA AGATGCGTGA ATACACCAGG 780
CTCATTTTAT TGCCAGTGCA GTCCTGGGTT TCAATTGGCA GCAAACAACT ATACCTGOGT 840
AGATATAAAT GAATGTGATG CCAGCAATCA ATGTGCTCAG CAGTGCTACA ACATTCTTGG 900
TTCATTCATC TGTCAGTGCA ATCAAGGATA TGAGCTAAGC AGTGACAGGC TCAACTGTGA 960
AGACATTGAT GAATGCAGAA CCTCAAGCTA CCTGTGTCAA TATCAATGTG TCAATGAACC 1020
TGGGAAATTC TCATGTATGT GCCCCCAGGG ATACCAAGTG GTGAGAAGTA GAACATGTCA 1080
AGATATAAAT GAGTGTGAGA CCACAAATGA ATGCCGGGAG GATGAAATGT GTTGGAATTA 1140
TCATGGCGGC TTCCGTTGTT ATCCACGAAA TCCTTGTCAA GATCCCTACA TTCTAACACC 1200
AGAGAACCGA TGTGTTTGCC CAGTCTCAAA TGCCATGTGC CGAGAACTGC CCCAGTCAAT 1260
AGTCTACAAA TACATGAGCA TCCGATCTGA TAGGTCTGTG CCATCAGACA TCTTCCAGAT 1320
ACAGGCCACA ACTATTTATG CCAACACCAT CAATACTTTT CGGATTAAAT CTGGAAATGA 1380
AAATGGAGAG TTCTACCTAC GACAAACAAG TCCTGTAAGT GCAATGCTTG TGCTCGTGAA 1440
GTCATTATCA GGACCAAGAG AACATATCGT GGACCTGGAG ATGCTGACAG TCAGCAGTAT 1500
AGGGACCTTC CGCACAAGCT CTGTGTTAAG ATTGACAATA ATAGTGGGGC CATTTTCATT 1560
TTAGTCTTTT CTAAGAGTCA ACCACAGGCA TTTAAGTCAG CCAAAGAATA TTGTTACCTT 1620
AAAGCACTAT TTTATTTATA GATATATCTA GTGCATCTAC ATCTCTATAC TGTACACTCA 1680
CCCATAACAA ACAATTACAC CATGGTATAA AGTGGGCATT TAATATGTAA AGATTCAAAG 1740
TTTGTCTTTA TTACTATATG TAAATTAGAC ATTAATCCAC TAAACTGGTC TTCTTCAAGA 1800
GAGCTAAGTA TACACTATCT GGTGAAACTT GGATTCTTTC CTATAAAAGT GGGACCAAGC 1860
AATGATGATC TTCTGTGGTG CTTAAGGAAA CTTACTAGAG CTCCACTAAC AGTCTCATAA 1920
GGAGGCAGCC ATCATAACCA TTGAATAGCA TGCAAGGGTA AGAATGAGTT TTTAACTGCT 1980
TTGTAAGAAA ATGGAAAAGG TCAATAAAGA TATATTTCTT TAGAAAATGG GGATCTGCCA 2040
TATTTGTGTT GGTTTTTATT TTCATATCCA GCCTAAAGGT GGTTGTTTAT TATATAGTAA 2100
TAAATCATTG CTGTACAACA TGCTGGTTTC TGTAGGGTAT TTTTAATTTT GTCAGAAATT 2160
TTAGATTGTG AATATTTTGT AAAAAACAGT AAGCAAAATT TTCCAGAATT CCCAAAATGA 2220
ACCAGATACC CCCTAGAAAA TTATACTATT GAGAAATCTA TGGGGAGGAT ATGAGAAAAT 2280
AAATTCCTTC TAAACCACAT TGGAACTGAC CTGAAGAAGC AAACTCGGAA AATATAATAA 2340
CATCCCTGAA TTCAGGCATT CACAAGATGC AGAACAAAAT GGATAAAAGG TATTTCACTG 2400
GAGAAGTTTT AATTTCTAAG TAAAATTTAA ATCCTAACAC TTCACTAATT TATAACTAAA 2460
ATTTCTCATC TTCGTACTTG ATGCTCACAG AGGAAGAAAA TGATGATGGT TTTTATTCCT 2520
GGCATCCAGA GTGACAGTGA ACTTAAGCAA ATTACCCTCC TACCCAATTC TATGGAATAT 2580
TTTATACGTC TCCTTGTTTA AAATCTGACT GCTTTACTTT GATGTATCAT ATTTTTAAAT 2640
AAAAATAAAT ATTCCTTTAG AAGATCACTC TAAAA

Seq ID NO: 18 Protein sequence:
Protein Accession #: NP_061489.1

1 11 21 31 41 51

MHSQQCTDGY EWDPVRQQCK DIDECDIVPD ACKGGMKCVN HYGGYLCLPK TAQIIVNNEQ 60
PQQETQPAEG TSGATTGVVA ASSMATSGVL PGGGFVASAA AVAGPEMQTG RNNFVIRRNP 120
ADPQRIPSNP SHRIQCAAGY EQSEHNVCQD IDECTAGTHN CRADQVCINL RGSFACQCPP 180
GYQKRGEQCV DIDECTIPPY CHQRCVNTPG SFYCQCSPGF QLAANNYTCV DINECDASNQ 240
CAQQCYNILG SFICQCNQGY ELSSDRLNCE DIDECRTSSY LCQYQCVNEP GKFSCMCPQG 300
YQVVRSRTCQ DINECETTNE CREDEMCWNY HGGFRCYPRN PCQDPYILTP ENRCVCPVSN 360
AMCRELPQSI VYKYMSIRSD RSVPSDIFQI QATTIYANTI NTFRIKSGNE NGEFYLRQTS 420
PVSAMLVLVK SLSGPREHIV DLEMLTVSSI GTFRTSSVLR LTIIVGPFSF



Seq ID NO: 19 Nucleotide sequence: 
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ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 

TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120 

CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180 

AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 240 

TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 300 

TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 






GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGOG 420 

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480 

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 

TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600 

CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720 

GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCOG ACAGAAAAAG 780 

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 

GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 

GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 1020 

TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140 

ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200 

TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500 

TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740 

TGGCGGTGCT GGGCGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCOGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTCCA CGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880 

AOGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATCGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG OCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA COT 

Seq ID NO: 20 Protein sequence: 
Protein Accession #: NP_006491 

60 

1 11 21 31 41 51 

I I I I I I 

MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVBV GSTALLKCGL SQSQGNLSHV 60 

DWFSVHKBKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120 

65 PRSQBYRIQIi RVYKAPEEPN IQVNPDGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP 180 

LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE 240 

VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREAEEETTN 300 

DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS UuSEPQELLV NYVSDVRVSP AAPERQEGSS 360 

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRB AGGGYRCVAS VPSIPGLNRT 420 

70 QLVKLAIFGP PWKAFKERKV WVKENMVIiNL SCBASGHPRP TISWNVNGTA SBQDQDPQRV 480 

LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILPLBLVNLT TLTPDSNTTT GLSTSTASPH 540 

TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY PLYKKGKLPC RRSGKQEITL 600 
PPSRKTELW EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YTDLRH 


Seq ID NO: 21 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002421 

Coding sequence: 72..1481 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGGATATTGG AGTAGCAAGA GGCTGGGAAG CCATCACTTA CCTTGCACTG AGAAAGAAGA 60 

CAAAGGCCAG TATGCACAGC TTTCCTCCAC TGCTGCTGCT GCTGTTCTGG GGTGTGGTGT 120 

CTCACAGCTT CCCAGCGACT CTAGAAACAC AAGAGCAAGA TGTGGACTTA GTCCAGAAAT 180 

ACCTGGAAAA ATACTACAAC CTGAAGAATG ATGGGAGGCA AGTTGAAAAG CGGAGAAATA 240 

GTGGCCCAGT GGTTGAAAAA TTGAAGCAAA TGCAGGAATT CTTTGGGCTG AAAGTGACTG 300 

GGAAACCAGA TGCTGAAACC CTGAAGGTGA TGAAGCAGCC CAGATGTGGA GTGCCTGATG 360 

TGGCTCAGTT TGTCCTCACT GAGGGGAACC CTCGCTGGGA GCAAACACAT CTGACCTACA 420 

GGATTGAAAA TTACACGCCA GATTTGCCAA GAGCAGATGT GGACCATGCC ATTGAGAAAG 480 

CCTTCCAACT CTGGAGTAAT GTCACACCTC TGACATTCAC CAAGGTCTCT GAGGGTCAAG 540 

CAGACATCAT GATATCTTTT GTCAGGGGAG ATCATCGGGA CAACTCTCCT TTTGATGGAC 600 

CTGGAGGAAA TCTTGCTCAT GCTTTTCAAC CAGGCCCAGG TATTGGAGGG GATGCTCATT 660 

TTGATGAAGA TGAAAGGTGG ACCAACAATT TCAGAGAGTA CAACTTACAT CGTGTTGCGG 720 

CTCATGAACT CGGCCATTCT CTTGGACTCT CCCATTCTAC TGATATCGGG GCTTTGATGT 780 

ACCCTAGCTA CACCTTCAGT GGTGATGTTC AGCTAGCTCA GGATGACATT GATGGCATCC 840 

AAGCCATATA TGGACGTTCC CAAAATCCTG TCCAGCCCAT CGGCCCACAA ACCCCAAAAG 900 

CGTGTGACAG TAAGCTAACC TTTGATGCTA TAACTACGAT TCGGGGAGAA GTGATGTTCT 960 

TTAAAGACAG ATTCTACATG CGCACAAATC CCTTCTACCC GGAAGTTGAG CTCAATTTCA 1020 

TTTCTGTTTT CTGGCCACAA CTGCCAAATG GGCTTGAAGC TGCTTACGAA TTTGCCGACA 1080 

GAGATGAAGT CCGGTTTTTC AAAGGGAATA AGTACTGGGC TGTTCAGGGA CAGAATGTGC 1140 

TACACGGATA CCCCAAGGAC ATCTACAGCT CCTTTGGCTT CCCTAGAACT GTGAAGCATA 1200 

TCGATGCTGC TCTTTCTGAG GAAAACACTG GAAAAACCTA CTTCTTTGTT GCTAACAAAT 1260 

ACTGGAGGTA TGATGAATAT AAACGATCTA TGGATCCAGG TTATCCCAAA ATGATAGCAC 1320 

ATGACTTTCC TGGAATTGGC CACAAAGTTG ATGCAGTTTT CATGAAAGAT GGATTTTTCT 1380 

ATTTCTTTCA TGGAACAAGA CAATACAAAT TTGATCCTAA AACGAAGAGA ATTTTGACTC 1440 

TCCAGAAAGC TAATAGCTGG TTCAACTGCA GGAAAAATTG AACATTACTA ATTTGAATGG 1500 

AAAACACATG GTGTGAGTCC AAAGAAGGTG TTTTCCTGAA GAACTGTCTA TTTTCTCAGT 1560 

CATTTTTAAC CTCTAGAGTC ACTGATACAC AGAATATAAT CTTATTTATA CCTCAGTTTG 1620 

CATATTTTTT TACTATTTAG AATGTAGCCC TTTTTGTACT GATATAATTT AGTTCCACAA 1680 

ATGGTGGGTA CAAAAAGTCA AGTTTGTGGC TTATGGATTC ATATAGGCCA GAGTTGCAAA 1740 

GATCTTTTCC AGAGTATGCA ACTCTGACGT TGATCCCAGA GAGCAGCTTC AGTGACAAAC 1800 

ATATCCTTTC AAGACAGAAA GAGACAGGAG ACATGAGTCT TTGCCGGAGG AAAAGCAGCT 1860 

CAAGAACACA TGTGCAGTCA CTGGTGTCAC CCTGGATAGG CAAGGGATAA CTCTTCTAAC 1920 
ACAAAATAAG TGTTTTATGT TTGGAATAAA GTCAACCTTG TTTCTACTGT TTT 



Seq ID NO: 22 Protein sequence: 
Protein Accession #: NP_002412 



1 11 21 31 41 51 

I I I I I I 

MHSFPPLLLL LFWGVVSHSF PATLETQEQD VDLVQKYLEK YYNLKNDGRQ VEKRRNSGPV 60 

VEKLKQMQEF FGLKVTGKPD AETLKVMKQP RCGVPDVAQF VLTEGNPRWE QTHLTYRIEN 120 

YTPDLPRADV DHAIEKAFQL WSNVTPLTFT KVSEGQADIM ISFVRGDHRD NSPFDGPGGN 180 

LAHAFQPGPG IGGDAHFDED ERWTNNFREY NLHRVAAHEL GHSLGLSHST DIGALMYPSY 240 

TFSGDVQLAQ DDIDGIQAIY GRSQNPVQPI GPQTPKACDS KLTFDAITTI RGEVMFFKDR 300 

FYMRTNPFYP EVELNFISVF WPQLPNGLEA AYEFADRDEV RFFKGNKYWA VQGQNVLHGY 360 

PKDIYSSFGF PRTVKHIDAA LSEENTGKTY FFVANKYWRY DEYKRSMDPG YPKMIAHDFP 420 
GIGHKVDAVF MKDGFFYFFH GTRQYKFDPK TKRILTLQKA NSWFNCRKN 



Seq ID NO: 23 Nucleotide sequence: 

Nucleic Acid Accession #: FGENESH predicted ORF 

Coding sequence: 141-1580 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TCTGCGTGTG CCGGGGCTAG GGGCTGGAAG TCCTGGCTCT AGTTGCACCT CGGAAGGAAA 60 

AGGCAAACAG AGGAGGGAAG GCGTCTTAGG ACTGCCTGGA TCCAGAGCAC TTTCCTCGGC 120 

CTCTACAGGC CTGTGTCGCT ATGGGTTCCC CCGCCGCCCC GGAGGGAGCG CTGGGCTACG 180 

TCCGCGAGTT CACTCGCCAC TCCTCCGACG TGCTGGGCAA CCTCAACGAG CTGCGCCTGC 240 

GCGGGATCCT CACTGAOGTC ACGCTGCTGG TTGGCGGGCA ACCCCTCAGA GCACACAAGG 300 

CAGTTCTCAT CGCCTGCAGT GGCTTCTTCT ATTCAATTTT COGGGGCCGT GCGGGAGTCG 360 

GGGTGGACGT GCTCTCTCTG CCCGGGGGTC CCGAAGCGAG AGGCTTCGCC CCTCTATTGG 420 

ACTTCATGTA CACTTCGCGC CTGCGCCTCT CTCCAGCCAC TGCACCAGCA GTCCTAGCGG 480 

CCGCCACCTA TTTGCAGATG GAGCACGTGG TCCAGGCATG CCACCGCTTC ATCCAGGCCA 540 

GCTATGAACC TCTGGGCATC TCCCTGCGCC CCCTGGAAGC AGAACCCCCA ACACCCCCAA 600 

OGGCCCCTCC ACCAGGTAGT CCCAGGOGCT CCGAAGGACA CCCAGACCCA CCTACTGAAT 660 

CTCGAAGCTG CAGTCAAGGC CCCCCCAGTC CAGCCAGCCC TGACCCCAAG GCCTGCAACT 720 

GGAAAAAGTA CAAGTACATC GTGCTAAACT CTCAGGCCTC CCAAGCAGGG AGCCTGGTCG 780 

GGGAGAGAAG TTCTGGTCAA CCTTGCCCCC AAGCCAGGCT CCCCAGTGGA GACGAGGCCT 840 

CCAGCAGCAG CAGCAGCAGC AGCAGCAGCA GTGAAGAAGG ACCCATTCCT GGTCCCCAGA 900 

GCAGGCTCTC TCCAACTGCT GCCACTGTGC AGTTCAAATG TGGGGCTCCA GCCAGTACCC 960 

CCTACCTCCT CACATCCCAG GCTCAAGACA CCTCTGGATC ACCCTCTGAA CGGGCTCGTC 1020 

CACTACCGGG AAGTGAATTT TTCAGCTGCC AGAACTGTGA GGCTGTGGCA GGGTGCTCAT 1080 

CGGGGCTGGA CTCCTTGGTT CCTGGGGACG AAGACAAACC CTATAAGTGT CAGCTGTGCC 1140 

GGTCTTCGTT CCGCTACAAG GGCAACCTTG CCAGTCATCG TACAGTGCAC ACAGGGGAAA 1200 

AGCCTTACCA CTGCTCAATC TGCGGAGCCC GTTTTAACCG GCCAGCAAAC CTGAAAACGC 1260 

ACAGCOGCAT CCATTCGGGA GAGAAGCCGT ATAAGTGTGA GACGTGCGGC TCGCGCTTTG 1320 

TACAGGTGGC ACATCTGCGG GCGCACGTGC TGATCCACAC CGGGGAGAAG CCCTACCCTT 1380 

GCCCTACCTG CGGAACCCGC TTCCGCCACC TGCAGACCCT CAAGAGCCAC GTTCGCATCC 1440 

ACACCGGAGA GAAGCCTTAC CACTGCGACC CCTGTGGCCT GCATTTCCGG CACAAGAGTC 1500 

AACTGCGGCT GCATCTGCGC CAGAAACACG GAGCTGCTAC CAACACCAAA GTGCACTACC 1560 

ACATTCTCGG GGGGCCCTAG CTGAGCGCAG GCCCAGGCCC CACTTGCTTC CTGCGGGTGG 1620 

GAAAGCTGCA GGCCCAGGCC TTGCTTCCCT ATCAGGCTTG GGCATAGGGG TGTGCCAGGC 1680 

CACTTTGGTA TCAGAAATTG CCACCCTCTT AATTTCTCAC TGGGGAGAGC AGGGGTGGCA 1740 

GATCCTGGCT AGATCTGCCT CTGTTTTGCT GGTCAAAACC TCTTCCCCAC AAGCCAGATT 1800 

GTTTCTGAGG AGAGAGCTAG CTAGGGGCTG GGAAAGGGGA GAGATTGGAG TCCTGGTCTC 1860 

CCTAAGGGAA TAGCCCTCCA CCTGTGGCCC CCATTGCATT CAGTTTATCT GTAAATATAA 1920 

TTTATTGAGG CCTTTGGGTG GCACCGGGGC CTTCATTCGA TTGCATTTCC CACTCCCCTC 1980 

TTCCACAAGT GTGATTAAAA GTGACCAGAA ACACAGAAGG TGAGATCACA GCTCTGCTGG 2040 

CAGAGATTAC TAGCCCTTGG CTCTCTOGTT TGGCTTGGGT ATTTTATATT ATTTCTGTCA 2100 

TAACTTTTAT CTTTAGAATT GTTCTTTCTC CTGTTTGTTT GCTTGTTAGT TTGTTTAAAA 2160 

TGGAAAAAGG GGTTCTCTGT GTTCTGCCCC TGTAATTCTA GGTCTGGAAC CTTTATTTGT 2220 

TCTAGGGCAG CTCTGGGAAC ATGCGGGATT GTGGAATTGG GTCAGGAACC CTCTCTGGTA 2280 

TTCTGGATGT TGTAGGTTCT CTAGCAGTCT AGAAATGGAT ACAGACATTT CTCTGTTCTT 2340 

CAAGGGTGAT AGGAACCATT ATGTTGAGCC CAAAATGGAA GTAATAATAA ATGCCTCCTG 2400 

GAGGCTGTGG GTGTGGGGGA TTCTGTATCT GGATTCCGTA TCACTCCAAC TGGAGGCTGT 2460 

GGGTGTGGGG GATTCTGTAT CTGGATTCCG TATCACTCCA AGTGGAGGCT GGCAGGTTTT 2520 

TCTGCAAGAT GGTCCAGAAT CTAAAATGTC CCATTAATCT GGTCACTTGG GTTTGGCTCT 2580 

GCTGTATCCA TCTATAGTGG TAGAGACCCA CCAGGGCTCA AGTGGAGTCC ATCATCCTCC 2640 

CACGGGGGCC TGTTCTTAGC ACTGAGTTGA TCGCTCCATG GGGGAGAGAT CAGACATTCC 2700 

TTATCAGAGA TGATGTGACC TTTTCTGACT CTGCCCAGTC TCTATGAATG TTATGGCCTA 2760 

GGGAAGAATC ATGAAACTCT TTAGCTTGAT TAGATGGTAA ACAGTGTTAA CCCATCCTTT 2820 

ACTACAGAGG CATATGGGTT TGAATGTTAC CTGGGGTTCT CTCTATTGAG TTGAGCCCCT 2880 

TCTTCCTTTA GTGGGTTTTG GACATCTTCT GGCAAGTGTC CAGATGCCAG AACCTTCTTT 2940 

TCCTCTAGAA GGGATGGTGC TTGGTAACCT TACCTTTTAA AAGCTGGGTC TGTGACCTGG 3000 

TCTTCCCATC CCTGCATTCC TGTCTGGAAC CAGTGAATGC ATTAGAACCT TCCATAGGAA 3060 

AAGAAAAGGG GCTGAGTTCC ATTCTGGGTT TGCTGTAGTT TGGTTGGGAT TATTGTTGGC 3120 

ATTACAGATG TAAAAGATTG ACTAGCCCAT AGGCCAAAGG CCTGTTCTAG TTGACCAAGT 3180 

TTCAAGTAGG ATTAAGAGGT TGGTTGAGGG GTGCAGTTTC TGGTGTAGGC CAGGTAGGTA 3240 

GAAAGTGAGG AACAGGGTTG CCTCTTGGCT GGGTGGAGTC TCTGAAATGT TAGAAGAAGC 3300 

GCTGAAGCCT TGATTGATAG TTCTGCCCCT TGTTGCCCTG GGGCTTATCT GATTATGGGA 3360 

GGAGGGTAGA AAGTAAGAAG CACTTTTGAA TTTGTGGGGT AGAACTTCAA CAATAAGTCA 3420 

GTTCTAGTGG CTGTCGCCTG GGGACTAGTG AGAAAGCTAC TCTTCTCCCT CTTCCCTCTT 3480 

TCTCCCCATG GCCCCACTGC AGAATTAAAG AAGGAAGAAG GGAAGGCGGA GGAGTCTATA 3540 

AGAAGGAATC ATGATTTCTA TTTAGCAGAT TGGATGGGCA GGTGGAGAAT GCCTGGGGGT 3600 

AGAAATGTTA GATCTTGCAA CATCAGATCC TTGGAATAAA GAAGCCTCTC TGYGCWRAAA 3660 
AAAAAAAAAA AAAAAA 



Seq ID NO: 24 Protein sequence: 
Protein Accession #: FGENESH predicted 



1 11 21 31 41 51 

I I I I I I 

MGSPAAPEGA LGYVREFTRH SSDVLGNLNE LRLRGILTDV TLLVGGQPLR AHKAVLIACS 60 

GFFYSIFRGR AGVGVDVLSL PGGPEARGFA PLLDFMYTSR LRLSPATAPA VLAAATYLQM 120 

EHVVQACHRF IQASYEPLGI SLRPLEAEPP TPPTAPPPGS PRRSEGHPDP PTESRSCSQG 180 

PPSPASPDPK ACNWKKYKYI VLNSQASQAG SLVGERSSGQ PCPQARLPSG DEASSSSSSS 240 

SSSSEEGPIP GPQSRLSPTA ATVQFKCGAP ASTPYLLTSQ AQDTSGSPSE RARPLPGSEF 300 

FSCQNCEAVA GCSSGLDSLV PGDEDKPYKC QLCRSSFRYK GNLASHRTVH TGEKPYHCSI 360 

CGARFNRPAN LKTHSRIHSG EKPYKCETCG SRFVQVAHLR AHVLIHTGEK PYPCPTCGTR 420 
FRHLQTLKSH VRIHTGEKPY HCDPCGLHFR HKSQLRLHLR QKHGAATNTK VHYHILGGP 



Seq ID NO: 25 Nucleotide sequence: 
Nucleic Acid Accession #: U21551 

Coding sequence: 1..1155 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGATTGCA GTAACGGATC GGCAGAGTGT ACCGGAGAAG GAGGATCAAA AGAGGTGGTG 60 

GGGACTTTTA AGGCTAAAGA CCTAATAGTC ACACCAGCTA CCATTTTAAA GGAAAAACCA 120 

GACCCCAATA ATCTGGTTTT TGGAACTGTG TTCACGGATC ATATGCTGAC GGTGGAGTGG 180 

TCCTCAGAGT TTGGATGGGA GAAACCTCAT ATCAAGCCTC TTCAGAACCT GTCATTGCAC 240 

CCTGGCTCAT CAGCTTTGCA CTATGCAGTG GAATTATTTG AAGGATTGAA GGCATTTCGA 300 

GGAGTAGATA ATAAAATTCG ACTGTTTCAG CCAAACCTCA ACATGGATAG AATGTATCGC 360 

TCTGCTGTGA GGGCAACTCT GCCGGTATTT GACAAAGAAG AGCTCTTAGA GTGTATTCAA 420 

CAGCTTGTGA AATTGGATCA AGAATGGGTC CCATATTCAA CATCTGCTAG TCTGTATATT 480 

CGTCCTGCAT TCATTGGAAC TGAGCCTTCT CTTGGAGTCA AGAAGCCTAC CAAAGCCCTG 540 

CTCTTTGTAC TCTTGAGCCC AGTCGGACCT TATTTTTCAA GTGGAACCTT TAATCCAGTG 600 

TCCCTGTGGG CCAATCCCAA GTATGTAAGA GCCTGGAAAG GTGGAACTGG GGACTGCAAG 660 

ATGGGAGGGA ATTACGGCTC ATCTCTTTTT GCCCAATGTG AAGACGTAGA TAATGGGTGT 720 

CAGGAGGTCC TGTGGCTCTA TGGCAGAGAC CATCAGATCA CTGAAGTGGG AACTATGAAT 780 

CTTTTTCTTT ACTGGATAAA TGAAGATGGA GAAGAAGAAC TGGCAACTCC TCCACTAGAT 840 

GGCATCATTC TTCCAGGAGT GACAAGGCGG TGCATTCTGG ACCTGGCACA TCAGTGGGGT 900 

GAATTTAAGG TGTCAGAGAG ATACCTCACC ATGGATGACT TGACAACAGC CCTGGAGGGG 960 

AACAGAGTGA GAGAGATGTT TAGCTCTGGT ACAGCCTGTG TTGTTTGCCC AGTTTCTGAT 1020 

ATACTGTACA AAGGCGAGAC AATACACATT CCAACTATGG AGAATGGTCC TAAGCTGGCA 1080 

AGCCGCATCT TGAGCAAATT AACTGATATC CAGTATGGAA GAGAAGAGAG CGACTGGACA 1140 
ATTGTGCTAT CCTGA 



Seq ID NO: 26 Protein sequence: 
Protein Accession #: AAB08528 

1 11 21 31 41 51 

I I I I I I 

MDCSNGSAEC TGEGGSKEVV GTFKAKDLIV TPATILKEKP DPNNLVFGTV FTDHMLTVEW 60 
SSEFGWEKPH IKPLQNLSLH PGSSALHYAV ELFEGLKAFR GVDNKIRLFQ PNLNMDRMYR 120 
SAVRATLPVF DKEELLECIQ QLVKLDQEWV PYSTSASLYI RPAFIGTEPS LGVKKPTKAL 180 
LFVLLSPVGP YFSSGTFNPV SLWANPKYVR AWKGGTGDCK MGGNYGSSLF AQCEDVDNGC 240 
QEVLWLYGRD HQITEVGTMN LFLYWINEDG EEELATPPLD GIILPGVTRR CILDLAHQWG 300 
EFKVSERYLT MDDLTTALEG NRVREMFSSG TACVVCPVSD ILYKGETIHI PTMENGPKLA 360 
SRILSKLTDI QYGREESDWT IVLS 



Seq ID NO: 27 Nucleotide sequence: 
Nucleic Acid Accession #: XM_039209 

Coding sequence: 656..2758 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TCGCGCGGGG GCCGCCCCCT CCCCTTCCCT CCACCCTGGG CGGGGGCGCG CGAGAAGCGG 60 

TGACGTCAAG GGGCGCGCTG TGGCAGCACC TCCCCGCGCG CTAGTTAAAA AGAAGAAGAA 120 

AAGAGGGAAC GAAACATGAG AGGCTGTGTG AGAAGCTGCA GCCGCCGGCA GAGGAGACCT 180 

CAGCATCATC TAGAGCCCAG CGCTGGCCCT GCCTCCGCCT GCCCCGCCGC CGCCGTCGCC 240 

GTTTCTGTTC CTGCTACTGT CCCACCTAAA CAACTCCCGT TACACGGACA AGTGAACATC 300 

TGTGGCTGTC CPCTCCTTTT CTTCCTCCTC TTCCAACTCC TTCTCCTCCT CCCACTTCCC 360 

AGCCGCAGCA GAAAGCCCCC AACCCAACTG ACACTGGCAC AACTGCAAAC GGTGTCATCC 420 

GCACAACTTT ATCTCGCTCC TCGGGCTCCC CTAAGGCATT GGACCCATCG CCGCGTCTTT 480 

TATTTTTTGC AAAGTTGCAT CGCTGTACAT ATTTTTGTCC CCGCCACCTC CCTCTGTCTC 540 

TGGAGTGCCC TACAGCCCCG CAAACTCCTC CTGGAGCTGC GCCCTAGTGC CCCTGCTGGG 600 

CAGTGGCGTT CCCCCCCATC CTCCCGCGCC CAGCCCCTGC TGCTCTGGGC AGACGATGCT 660 

GAAGATGCTC TCCTTTAAGC TGCTGCTGCT GGCCGTGGCT CTGGGCTTCT TTGAAGGAGA 720 

TGCTAAGTTT GGGGAAAGAA ACGAAGGGAG CGGAGCAAGG AGGAGAAGGT GCCTGAATGG 780 

GAACCCCCCG AAGCGCCTGA AAAGGAGAGA CAGGAGGATG ATGTCCCAGC TGGAGCTGCT 840 

GAGTGGGGGA GAGATGCTGT GCGGTGGCTT CTACCCTCGG CTGTCCTGCT GCCTGCGGAG 900 

TGACAGCCCG GGGCTAGGGC GCCTGGAGAA TAAGATATTT TCTGTTACCA ACAACACAGA 960 

ATGTGGGAAG TTACTGGAGG AAATCAAATG TGCACTTTGC TCTCCACATT CTCAAAGCCT 1020 

GTTCCACTCA CCTGAGAGAG AAGTCTTGGA AAGAGACCTA GTACTTCCTC TGCTCTGCAA 1080 

AGACTATTGC AAAGAATTCT TTTACACTTG CCGAGGCCAT ATTCCAGGTT TCCTTCAAAC 1140 

AACTGCGGAT GAGTTTTGCT TTTACTATGC AAGAAAAGAT GGTGGGTTGT GCTTTCCAGA 1200 

TTTTCCAAGA AAACAAGTCA GAGGACCAGC ATCTAACTAC TTGGACCAGA TGGAAGAATA 1260 

TGACAAAGTG GAAGAGATCA GCAGAAAGCA CAAACACAAC TGCTTCTGTA TTCAGGAGGT 1320 

TGTGAGTGGG CTGCGGCAGC CCGTTGGTGC CCTGCATAGT GGGGATGGCT CGCAACGTCT 1380 

CTTCATTCTG GAAAAAGAAG GTTATGTGAA GATACTTACC CCTGAAGGAG AAATTTTCAA 1440 

GGAGCCTTAT TTGGACATTC ACAAACTTGT TCAAAGTGGA ATAAAGGGAG GAGATGAAAG 1500 

AGGACTGCTA AGCCTCGCAT TCCATCCCAA TTACAAGAAA AATGGAAAGT TGTATGTGTC 1560 

CTATACCACC AACCAAGAAC GGTGGGCTAT CGGGCCTCAT GACCACATTC TTAGGGTTGT 1620 

GGAATACACA GTATCCAGAA AAAATCCACA CCAAGTTGAT TTGAGAACAG CCAGAGTCTT 1680 

TCTTGAAGTT GCAGAACTCC ACAGAAAGCA TCTGGGAGGA CAACTGCTCT TTGGCCCTGA 1740 

CGGCTTTTTG TACATCATTC TTGGTGATGG GATGATTACA CTGGATGATA TGGAAGAAAT 1800 

GGATGGGTTA AGTGATTTCA CAGGCTCAGT GCTACGGCTG GATGTGGACA CAGACATGTG 1860 

CAAOGTGCCT TATTCCATAC CAAGGAGCAA CCCACACTTC AACAGCACCA ACCAGCCCCC 1920 

CGAAGTGTTT GCTCATGGGC TCCACGATCC AGGCAGATGT GCTGTGGATA GACATCCCAC 1980 

TGATATAAAC ATCAATTTAA OGATACTGTG TTCAGACTCC AATGGAAAAA ACAGATCATC 2040 
AGCCAGAATT CTACAGATAA TAAAGGGGAA AGATTATGAA AGTGAGCCAT CACTTTTAGA 2100 

ATTCAAGCCA TTCAGTAATG GTCCTTTGGT TGGTGGATTT GTATACCGGG GCTGCCAGTC 2160 

AGAAAGATTG TATGGAAGCT ACGTGTTTGG AGATCGTAAT GGGAATTTCC TAACTCTCCA 2220 

GCAAAGTCCT GTGACAAAGC AGTGGCAAGA AAAACCACTC TGTCTCGGCA CTAGTGGGTC 2280 

CTGTAGAGGC TACTTTTCCG GTCACATCTT GGGATTTGGA GAAGATGAAC TAGGTGAAGT 2340 

TTACATTTTA TCAAGCAGTA AAAGTATGAC CCAGACTCAC AATGGAAAAC TCTACAAAAT 2400 

TGTAGATCCC AAAAGACCTT TAATGCCTGA GGAATGCAGA GCCACGGTAC AACCTGCACA 2460 

GACACTGACT TCAGAGTGCT CCAGGCTCTG TCGAAACGGC TACTGCACCC CCACGGGAAA 2520 

GTGCTGCTGC AGTCCAGGCT GGGAGGGGGA CTTCTGCAGA ACTGCAAAAT GTGAGCCAGC 2580 

ATGTCGTCAT GGAGGTGTCT GTGTTAGACC GAACAAGTGC CTCTGTAAAA AAGGATATCT 2640 

TGGTCCTCAA TGTGAACAAG TGGACAGAAA CATCCGCAGA GTGACCAGGG CAGGTATTCT 2700 

TGATCAGATC ATTGACATGA CATCTTACTT GCTGGATCTA ACAAGTTACA TTGTATAGTT 2760 

TCTGGGACTG TTTGAATATT CTATTCCAAT GGGCATTTAT TTTTTATCCT GTCATTAAAA 2820 

AAAAAAGACT GTTATCCTGC TACACACTCC TGTGATTTCA TTCTCTTTTA TTAATTTAAA 2880 

AATAATTTCC AGAAATGTGC AGATCCTCTG TGTGTATGTC AGCATGTTTG TTCACATATG 2940 

CACATACACA TACTCATAAC CCCTATATGC GTTGTTGCAT AACAGATGAT TTTTTAAAAT 3000 

ATATACTTCC TTATGCAAAG TAATTTACAC AGAAATTCCA TTGTAAATTG ATAATGGATT 3060 

TTTTATGTTA CTAGAAGAGA TTATTTGACT TCCCAGGAAT TTTCTGTCTG TAATCACTAA 3120 

AGTCAACTTT AATAGAGTTT TGAAACAGTA CTGTGCAATC CGATGGATCT AATTAAAAAA 3180 

AAGGCAATAT TTTTATATTA AAGTACTATA CTAGGAGAGA ATGTTTCAGA ACTCCCTGAT 3240 

GAATTTCTAA GTGAGCAACT TGATATAAAA TTGTAATCTT CATTTTTGTC AGTGTATCCA 3300 

GTTACAGAAT GCTACACACT TACCTTTTTA TTGGCTGAGA AATCTGGTTA TTTCATCTTA 3360 

ATCTCAAGAT TGTTTTCAAG TGTTTTATAA TTAAATCATA ATAGCATATT TTAAAATCAA 3420 

TCTTCCTAAA AGGTCTGCTT TTATTGTATA TTTTATTTAA CAATAGGCAC TGGGTTTGTG 3480 
TTACATATTT ATATATTTTA TTTTATTTTT ATAATATAGA CATCACCTAG 



Seq ID NO: 28 Protein sequence: 
Protein Accession #: XP_039209 



1 11 21 31 41 51 

I I I I I I 

MLKMLSFKLL LLAVALGFFE GDAKFGERNE GSGARRRRCL NGNPPKRLKR RDRRMMSQLE 60 

LLSGGEMLCG GFYPRLSCCL RSDSPGLGRL ENKIFSVTNN TECGKLLEEI KCALCSPHSQ 120 

SLFHSPEREV LERDLVLPLL CKDYCKEFFY TCRGHIPGFL QTTADEFCFY YARKDGGLCF 180 

PDFPRKQVRG PASNYLDQME EYDKVEEISR KHKHNCFCIQ EVVSGLRQPV GALHSGDGSQ 240 

RLFILEKEGY VKILTPEGEI FKEPYLDIHK LVQSGIKGGD ERGLLSLAFH PNYKKNGKLY 300 

VSYTTNQERW AIGPHDHILR VVEYTVSRKN PHQVDLRTAR VFLEVAELHR KHLGGQLLFG 360 

PDGFLYIILG DGMITLDDME EMDGLSDFTG SVLRLDVDTD MCNVPYSIPR SNPHFNSTNQ 420 

PPEVFAHGLH DPGRCAVDRH PTDININLTI LCSDSNGKNR SSARILQIIK GKDYESEPSL 480 

LEFKPFSNGP LVGGFVYRGC QSERLYGSYV FGDRNGNFLT LQQSPVTKQW QEKPLCLGTS 540 

GSCRGYFSGH ILGFGEDELG EVYILSSSKS MTQTHNGKLY KIVDPKRPLM PEECRATVQP 600 

AQTLTSECSR LCRNGYCTPT GKCCCSPGWE GDFCRTAKCE PACRHGGVCV RPNKCLCKKG 660 
YLGPQCEQVD RNIRRVTRAG ILDQIIDMTS YLLDLTSYIV 



Seq ID NO: 29 Nucleotide sequence: 
Nucleic Acid Accession #: NM_024756 

Coding sequence: 75..2924 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

AAGACAACGT CACTAGCAGT TTCTGGAGCT ACTTGCCAAG GCTGAGTGTG AGCTGAGCCT 60 

GCCCCACCAC CAAGATGATC CTGAGCTTGC TGTTCAGCCT TGGGGGCCCC CTGGGCTGGG 120 

GGCTGCTGGG GGCATGGGCC CAGGCTTCCA GTACTAGCCT CTCTGATCTG CAGAGCTCCA 180 

GGACACCTGG GGTCTGGAAG GCAGAGGCTG AGGACACCAG CAAGGACCCC GTTGGACGTA 240 

ACTGGTGCCC CTACCCAATG TCCAAGCTGG TCACCTTACT AGCTCTTTGC AAAACAGAGA 300 

AATTCCTCAT CCACTCGCAG CAGCCGTGTC CGCAGGGAGC TCCAGACTGC CAGAAAGTCA 360 

AAGTCATGTA CCGCATGGCC CACAAGCCAG TGTACCAGGT CAAGCAGAAG GTGCTGACCT 420 

CTTTGGCCTG GAGGTGCTGC CCTGGCTACA CGGGCCCCAA CTGCGAGCAC CACGATTCCA 480 

TGGCAATCCC TGAGCCTGCA GATCCTGGTG ACAGCCACCA GGAACCTCAG GATGGACCAG 540 

TCAGCTTCAA ACCTGGCCAC CTTGCTGCAG TGATCAATGA GGTTGAGGTG CAACAGGAAC 600 

AGCAGGAACA TCTGCTGGGA GATCTCCAGA ATGATGTGCA CCGGGTGGCA GACAGCCTGC 660 

CAGGCCTGTG GAAAGCCCTG CCTGGTAACC TCACAGCTGC AGTGATGGAA GCAAATCAAA 720 

CAGGGCACGA GTTCCCTGAT AGATCCTTGG AGCAGGTGCT GCTACCCCAC GTGGACACCT 780 

TCCTACAAGT GCATTTCAGC CCCATCTGGA GGAGCTTTAA CCAAAGCCTG CACAGCCTTA 840 

CCCAGGCCAT AAGAAACCTG TCTCTTGACG TGGAGGCCAA CCGCCAGGCC ATCTCCAGAG 900 

TCCAGGACAG TGCCGTGGCC AGGGCTGACT TCCAGGAGCT TGGTGCCAAA TTTGAGGCCA 960 

AGGTCCAGGA GAACACTCAG AGAGTGGGTC AGCTGCGACA GGACGTGGAG GACCGCCTGC 1020 

ACGCCCAGCA CTTTACCCTG CACCGCTCGA TCTCAGAGCT CCAAGCCGAT GTGGACACCA 1080 

AATTGAAGAG GCTGCACAAG GCTCAGGAGG CCCCAGGGAC CAATGGCAGT CTGGTGTTGG 1140 

CAACGCCTGG GGCTGGGGCA AGGCCTGAGC CGGACAGCCT GCAGGCCAGG CTGGGCCAGC 1200 

TGCAGAGGAA CCTCTCAGAG CTGCACATGA CCACGGCCCG CAGGGAGGAG GAGTTGCAGT 1260 

ACACCCTGGA GGACATGAGG GCCACCCTGA CCOGGCACGT GGATGAGATC AAGGAACTGT 1320 

ACTCCGAATC GGACGAGACT TTCGATCAGA TTAGCAAGGT GGAGCGGCAG GTGGAGGAGC 1380 

TGCAGGTGAA CCACACGGCG CTCCGTGAGC TGCGCGTGAT CCTGATGGAG AAGTCTCTGA 1440 

TCATGGAGGA GAACAAGGAG GAGGTGGAGC GGCAGCTCCT GGAGCTCAAC CTCACGCTGC 1500 

AGCACCTGCA GGGTGGCCAT GCCGACCTCA TCAAGTACGT GAAGGACTGC AATTGCCAGA 1560 

AGCTCTATTT AGACCTGGAC GTCATCCGGG AGGGCCAGAG GGACGCCACG CGTGCCCTGG 1620 

AGGAGACCCA GGTGAGCCTG GACGAGCGGC GGCAGCTGGA CGGCTCCTCC CTGCAGGCCC 1680 

TGCAGAACGC CGTGGACGCC GTGTCGCTGG CCGTGGACGC GCACAAAGCG GAGGGOGAGC 1740 

GGGCGCGGGC GGCCACGTCG CGGCTCCGGA GCCAAGTGCA GGCGCTGGAT GACGAGGTGG 1800 

GCGCGCTGAA GGCGGCCGCG GCCGAGGCCC GCCACGAGGT GOGCCAGCTG CACAGCGCCT 1860 

TCGCCGCCCT GCTGGAGGAC GOGCTGCGGC ACGAGGCGGT GCTGGCCGCG CTCTTCGGGG 1920 

AGGAGGTGCT GGAGGAGATG TCTGAGCAGA OGCCGGGAOC GCTGCCCCTG AGCTACGAGC 1980 

AGATCCGCGT GGCCCTGCAG GACGCCGCTA GCGGGCTGCA GGAGCAGGCG CTCGGCTGGG 2040 

ACGAGCTGGC CGCCCGAGTG ACGGCCCTGG AGCAGGCCTC GGAGCCCCCG CGGCCGGCAG 2100 

AGCACCTGGA GCCCAGCCAC GACGCGGGCC GCGAGGRGGC CGCCACCACC GCCCTGGCCG 2160 

GGCTGGCGCG GGAGCTCCAG AGCCTGAGCA ACGACGTCAA GAATGTCGGG CGGTGCTGCG 2220 

AGGCCGAGGC CGGGGCCGGG GCCGCCTCCC TCAAOGCCTC CCTTGACGGC CTCCACAACG 2280 

CACTCTTCGC CACTCAGCGC AGCTTGGAGC AGCACCAGCG GCTCTTCCAC AGCCTCTTTG 2340 

GGAACTTCCA AGGGCTCATG GAAGCCAACG TCAGCCTGGA CCTGGGGAAG CTGCAGACCA 2400 

TGCTGAGCAG GAAAGGGAAG AAGCAGCAGA AAGACCTGGA AGCTCCCOGG AAGAGGGACA 2460 

AGAAGGAAGC GGAGCCTTTG GTGGACATAC GGGTCACAGG GCCTGTGCCA GGTGCCTTGG 2520 

GCGCGGCGCT CTGGGAGGCA GGATCCCCTG TGGCCTTCTA TGCCAGCTTT TCAGAAGGGA 2580 

CGGCTGCCCT GCAGACAGTG AAGTTCAACA CCACATACAT CAACATTGGC AGCAGCTACT 2640 

TCCCTGAACA TGGCTACTTC CGAGCCCCTG AGCGTGGTGT CTACCTGTTT GCAGTGAGCG 2700 

TTGAATTTGG CCCAGGGCCA GGCACCGGGC AGCTGGTGTT TGGAGGTCAC CATCGGACTC 2760 

CAGTCTGTAC CACTGGGCAG GGGAGTGGAA GCACAGCAAC GGTCTTTGCC ATGGCTGAGC 2820 

TGCAGAAGGG TGAGCGAGTA TGGTTTGAGT TAACCCAGGG ATCAATAACA AAGAGAAGCC 2880 

TGTCGGGGAC TGCATTTGGG GGCTTCCTGA TGTTTAAGAC CTGAACCCCA GCCCCAATCT 2940 

GATCAGACAT CATGGACTCG CCCAGCTCTC CTCGGCCTGG GGCTCTGGCC AAGGATGGGC 3000 

TGGAGGTCAT TCAGTTGGTC TGTCTCTTCC CTGGAAACCT TCTGCAAAGA TGGTGTGGTG 3060 

TACGTGGCTT CCCTGTAACC ACATGGGGCT TGGCCATTTC TCCATGATGA GAAGGACTGG 3120 

AATGCTTCTC CGGGCAGGAC ATGGTCCTAG GAAGCCTGAA CCTTGGCTTG GCATGCCTTC 3180 

TCAGACAGCA CGGCCTGGGC TCCAACTCTT CACCACACCC TGTATTCTAC AACTTCTTTG 3240 

GTGTTTTGCT CCTCCTGTGG TTGGAAACTT CTGTACAACA CTTTAAACTT TTCTCTTGCT 3300 

TCCTCTTCTC TTCTCCCTTA TCGTATGATA GAAAGACATT CTTCCCCAGG AGGAATGTTT 3360 

AAAATGGAGG CAACATTTTG GCCAACATTG GAAAGCACTA GAGGGCAATG GGATTAAACC 3420 

AACCTGCTTG GTCTCTATTA GTCAGTAATG AAGACGACAG CCTGGCCAAC CAAGGGAAAG 3480 

GAAATTAGTA TCTTTAGTTT CAGTCATTCC TTGTAGGATA TGGTTTAGCT GTGCCCCCAC 3540 

CTAAAATATC ATCTTGAATT GTAATCCCTA TAATCCCCAC ATCAAGGGAG AGATCAGGTG 3600 

GAGGTAATTG GATCTTGGGG GCGGTTCCCC CATGCTGTTC TTGTGATAGT TCTCACGAGA 3660 

TCTGATGATT TTATAAGTTT GATAGTTCCT CCTGTGTTCA TTCTCCTTCC TGCCACCTTG 3720 

TGAAGATGCC TTGGTTCCTC TTCACTGTCT GCCATGATTG TAAGTTTCCT GAGGCCTCCC 3780 
CAGCCATGTG GAACAGTGAG TCAATTAAAC CTCTTTCCTT TATAAATT 



Seq ID NO: 30 Protein sequence: 
Protein Accession #: NP_079032 

1 11 21 31 41 51 

I I I I I I 

MILSLLFSLG GPLGWGLLGA WAQASSTSLS DLQSSRTPGV WKAEAEDTSK DPVGRNWCPY 60 
PMSKLVTLLA LCKTEKFLIH SQQPCPQGAP DCQKVKVMYR MAHKPVYQVK QKVLTSLAWR 120 
CCPGYTGPNC EHHDSMAIPE PADPGDSHQE PQDGPVSFKP GHLAAVINEV EVQQEQQEHL 180 
LGDLQNDVHR VADSLPGLWK ALPGNLTAAV MEANQTGHEF PDRSLEQVLL PHVDTFLQVH 240 
FSPIWRSFNQ SLHSLTQAIR NLSLDVEANR QAISRVQDSA VARADFQELG AKFEAKVQEN 300 
TQRVGQLRQD VEDRLHAQHF TLHRSISELQ ADVDTKLKRL HKAQEAPGTN GSLVLATPGA 360 
GARPEPDSLQ ARLGQLQRNL SELHMTTARR EEELQYTLED MRATLTRHVD EIKELYSESD 420 
ETFDQISKVE RQVEELQVNH TALRELRVIL MEKSLIMEEN KEEVERQLLE LNLTLQHLQG 480 
GHADLIKYVK DCNCQKLYLD LDVIREGQRD ATRALEETQV SLDERRQLDG SSLQALQNAV 540 
DAVSLAVDAH KAEGERARAA TSRLRSQVQA LDDEVGALKA AAAEARHEVR QLHSAFAALL 600 
EDALRHEAVL AALFGEEVLE EMSEQTPGPL PLSYEQIRVA LQDAASGLQE QALGWDELAA 660 
RVTALEQASE PPRPAEHLEP SHDAGREEAA TTALAGLARE LQSLSNDVKN VGRCCEAEAG 720 
AGAASLNASL DGLHNALFAT QRSLEQHQRL FHSLFGNFQG LMEANVSLDL GKLQTMLSRK 780 
GKKQQKDLEA PRKRDKKEAE PLVDIRVTGP VPGALGAALW EAGSPVAFYA SFSEGTAALQ 840 
TVKFNTTYIN IGSSYFPEHG YFRAPERGVY LFAVSVEFGP GPGTGQLVFG GHHRTPVCTT 900 
GQGSGSTATV FAMAELQKGE RVWFELTQGS ITKRSLSGTA FGGFLMFKT 

Seq ID NO: 31 Nucleotide sequence: 
Nucleic Acid Accession #: AB037715 
Coding sequence: 370..3489 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GAACGCTGAC AGAACAGGCA GTGCAATTCC ATGTTCCTCT TAAGTATGTT AGCCCTACCG 60 

GGAGCTGAGC TGGCCAGTCT ACTTGGAGAG GAAAAGTAGA TCTGGGGAAG GTGGAAGGGT 120 

CAGTTCCTAA GTGACTTCCT CCTCGGGGAT GGTAAGGGCA TTTGCTGATC TCCAGTGACT 180 

GCCTGGTGCC TCATGGTCAG ACTCGGCTGT CTCACTCCCA GATATCTGAT TTTGCAAAAA 240 

GGGACACACC TATCTGCAGC AAAGAAGACA CTGACCAGAT TGCGAGCGGT GCTTTTGGAT 300 

GCTCTGTAGC CACCCGGGGC CCAGGAGGAC TGACTCGGCA GCAGGATTCG TGCATGGGAA 360 

TCGGAGACCA TGGCAGTGCA GCTGGTGCCC GACTCAGCTC TCGGCCTGCT GATGATGACG 420 

GAGGGCCGCC GATGTCAAGT ACATCTTCTT GATGACAGGA AGCTGGAACT CCTAGTACAG 480 

CCCAAGCTGT TGGCCAAGGA GCTTCTTGAC CTTGTGGCTT CTCACTTCAA TCTGAAGGAA 540 

AAGGAGTACT TTGGAATAGC ATTCACAGAT GAAACGGGAC ACTTAAACTG GCTTCAGCTA 600 

GATCGAAGAG TATTGGAACA TGACTTCCCT AAAAAGTCAG GACCCGTGGT TTTATACTTT 660 

TGTGTCAGGT TCTATATAGA AAGCATTTCA TACCTGAAGG ATAATGCTAC CATTGAGCTT 720 

TTCTTTCTGA ACGCGAAGTC CTGCATCTAC AAGGAGCTTA TTGACGTTGA CAGCGAAGTG 780 

GTGTTTGAAT TAGCTTCCTA TATTTTACAG GAGGCAAAGG GAGATTTTTC TAGCAATGAA 840 

GTTGTGAGGA GTGACTTGAA GAAGCTGCCA GCCCTTCCCA CCCAAGCCCT GAAGGAGCAC 900 

CCTTCCCTGG CCTACTGTGA AGACAGAGTC ATTGAGCACT ACAAGAAACT GAACGGTCAG 960 

ACAAGAGGTC AAGCAATCGT AAACTACATG AGCATCGTGG AGTCTCTCCC AACCTACGGG 1020 

GTTCACTATT ATGCAGTGAA GGACAAGCAG GGCATACCAT GGTGGCTGGG CCTGAGCTAC 1080 

AAAGGGATCT TCCAGTATGA CTACCATGAT AAAGTGAAGC CAAGAAAGAT ATTCCAATGG 1140 

AGACAGTTGG AAAACCTGTA CTTCAGAGAA AAGAAGTTTT CCGTGGAAGT TCATGACCCA 1200 

CGCAGGGCTT CAGTGACAAG GAGGACGTTT GGGCACAGCG GCATTGCAGT GCACACGTGG 1260 

TATGCATGTC CGGCATTGAT CAAGTCCATC TGGGCTATGG CCATAAGCCA ACACCAGTTC 1320 

TATCTGGACA GAAAGCAGAG TAAGTCCAAA ATCCATGCAG CACGCAGCCT GAGTGAGATC 1380 

GCCATCGACC TGACCGAGAC GGGGACGCTG AAGACCTCGA AGCTGGCCAA CATGGGTAGC 1440 

AAGGGGAAGA TCATCAGCGG CAGCAGCGGC AGCCTGCTGT CTTCAGGTTC TCAGGAATCA 1500 

GATAGCTCGC AGTCGGCCAA GAAGGACATG CTGGCTGCCT TGAAGTCCAG GCAGGAAGCT 1560 

CTGGAGGAAA CCCTGCGTCA GAGGCTGGAG GAACTGAAGA AGCTGTGTCT CCGAGAAGCT 1620 

GAGCTCACGG GCAAGCTGCC AGTAGAATAT CCCCTGGATC CAGGGGAGGA ACCACCCATT 1680 

GTTCGGAGAA GAATAGGAAC AGCCTTCAAA CTGGATGAAC AGAAAATCCT GCCCAAAGGA 1740 

GAGGAAGCTG AGCTGGAACG CCTGGAACGA GAGTTTGCCA TTCAGTCCCA GATTACGGAG 1800 

GCCGCCCGCC GCCTAGCCAG TGACCCCAAC GTCAGCAAAA AACTGAAGAA ACAAAGGAAA 1860 

ACCTCGTATC TGAATGCACT GAAGAAACTG CAGGAGATTG AAAATGCAAT CAATGAGAAC 1920 

CGCATCAAGT CTGGGAAGAA ACCCACCCAG AGGGCTTCGC TGATCATAGA CGATGGAAAC 1980 

ATTGCCAGTG AAGACAGCTC CCTCTCAGAT GCCCTTGTTC TTGAGGATGA AGACTCTCAG 2040 

GTTACCAGCA CAATATCCCC CCTACATTCT CCTCACAAGG GACTCCCTCC TCGGCCACCG 2100 

TCGCACAACA GGCCTCCTCC TCCCCAGTCC CTGGAGGGAC TCCGACAGAT GCACTATCAC 2160 

CGCAACGACT ATGACAAGTC ACCCATCAAG CCCAAAATGT GGAGTGAGTC CTCTTTAGAT 2220 

GAACCCTATG AGAAGGTCAA GAAGCGCTCC TCTCACAGCC ATTCCAGCAG CCACAAGCGC 2280 

TTCCCCAGCA CAGGAAGCTG TGCGGAAGCC GGCGGAGGAA GCAACTCCTT GCAGAACAGC 2340 

CCCATCCGCG GCCTCCCGCA CTGGAACTCC CAGTCCAGCA TGCCGTCCAC GCCAGACCTG 2400 

CGGGTCCGGA GTCCCCACTA CGTCCATTCC ACGAGGTCGG TGGACATCAG CCCCACCCGA 2460 

CTGCACAGCC TCGCACTGCA CTTTAGGCAC CGGAGCTCCA GCCTGGAGTC CCAGGGCAAG 2520 

CTCCTGGGCT CGGAAAACGA CACCGGGAGC CCCGACTTCT ACACCCCGCG GACTCGTAGC 2580 

AGCAACGGCT CAGACCCCAT GGACGACTGC TCGTCGTGCA CCAGCCACTC GAGCTCGGAG 2640 

CACTACTACC CGGCGCAGAT GAACGCCAAC TACTCCACGC TGGCCGAGGA CTOGCCGTCC 2700 

AAGGCGCGCC AGAGGCAGAG GCAGCGGCAG CGGGCGGCGG GCGCACTGGG CTCAGCCAGC 2760 

TCGGGCAGCA TGCCCAACCT GGCGGCGCGC GGGGGTGCGG GGGGCGCGGG GGGCGCGGGG 2820 

GGCGGTGTGT ACCTGCACAG CCAGAGCCAG CCCAGCTCGC AGTACCGCAT CAAGGAGTAC 2880 

CCGCTGTACA TCGAGGGCGG OGCCAOGCCC GTGGTGGTGC GCAGCCTGGA GAGCGACCAG 2940 

GAGTGCCACT ACAGCGTCAA GGCTCAGTTC AAGACGTCCA ACTCCTACAC GGCGGGCGGC 3000 

CTGTTCAAGG AGAGCTGGCG CGGCGGCGGC GGCGACGAGG GCGACACGGG CCGCCTGACG 3060 

CCGTCGCGAT CGCAGATCCT GCGGACTCCG TCGCTGGGCC GCGAGGGCGC CCACGACAAG 3120 

GGCGCGGGCC GTGCCGCCGT CTCAGACGAG CTGCGCCAGT GGTACCAGCG TTCCACCGCC 3180 

TCGCACAAGG AGCACAGCCG CCTGTCGCAC ACCAGCTCCA CCTCCTOGGA CAGCGGCTCG 3240 

CAGTACAGCA CCTCCTCCCA GAGCACCTTC GTGGCGCACA GCAGGGTCAC CAGGATGCCC 3300 

CAGATGTGCA AGGCCACGTC AGCTGCCTTA CCTCAAAGCC AGAGAAGCTC GACACCGTCA 3360 

AGTGAAATTG GAGCCACCCC CCCAAGCAGC CCCCACCACA TCCTAACCTG GCAGACTGGA 3420 

GAAGCAACAG AAAACTCACC CATTCTGGAT GGGTCTGAGT CTCCACCTCA CCAAAGTACT 3480 

GATGAATAGA GGAGCTACAA TGATAGCTGT TTCCTGGATT CCTCCCTCTA TCCAGAACTA 3540 

GCTGATGTCC AGTGGTACGG GCAGGAAAAA GCCAAGCCCG GGACCCTCGT GTGAGCCAGC 3600 

CCGGCCTAAT CTGACCGCCT CAACGCCATT CTGAGATCAC CTCACTGCCT CTCATTTGCC 3660 

TTACCCAGAC GCACCGTCAC CCTGCACCAG CTTTGGCCCT CAGCACTTTT TTTCTCCTGT 3720 

CTCCGCATTC CCTCCCCCTT GAAAACCTGA CTGAGGAGAC ATTCTGGAAG GTTCCGGTCC 3780 

CACTGTGTGT CCCCTGGCGC TCTTGCCCAT AGAGAGCCAG ACACCAATCC TCAATGGCAC 3840 

CTTGGTGGCT TCCCTCTGCC ATGACAGCCC CTAGGCCAGG AACCATCAGG GGGGCCAGCC 3900 

GGCATCCAAT TCCTGCGGAT AAGTAGCGTT GGGAGAGAAC GGGAAAGGGG ACTTGGGTTA 3960 

CAGGGTGACC CAGAAAGACG ATTCAGCTGT GTCCAGCCTG CCACCCATAC GTAGGCCAAC 4020 

CAAGCACTTC ATGAAGAGGA GGCCTCGTGG CATATTCAGT TTACACCTGA AATATTCCTT 4080 

GATGGGACAG CTTGTGGGGA TGGCTATGGG GGAAGGGGAG GTTGAGAAAG GAAGTTCTCG 4140 

ACACCAGAAA TGCATCGGAG GACCACAATC AGTTCTATGC TGCCAAAGAT TAAAAATAAA 4200 

TAAAAACATA AAAAATTAAG AGGGGCCAAG AGGAAGACAT TCTTTCTGCA AGGAAATTTC 4260 

TTTTAAATTC TGAACTGCTA CTACACACAA GTGAAAGTCA ACCCTATGTA AACTGGTGTC 4320 

CTCTCTCTAG CCCTCTCCCT TACTGGCCCA CTTCTCTCTC CGTAGAGAGC CTGAAAAACT 4380 

GCCCCAATGC CACGGTAAAG GCGAGGAAGT CTTGGCTGGC GTTGCTGACT CACAGTCGCC 4440 

ATCCATCTGG ACACAAAGAG AGACCTGTGG GAGTCATAGA GGGTACTGTT AGCCCCGGTC 4500 

CATGCAGGGG GTTCAGCCGA GCCCAAGACT CAAAGCTGCT TTCCTTTCAG GATTTGTAGT 4560 

AACGTAAGGT GATAATGGCC AAAAGTGGTT CTCTCTCATT AAACCAACCA GTAAAAGCGT 4620 

ATCCTATTTT TTTGCATAAG GTGTTTCATT TTCGTTTTTA TGGGAAACCA AGGGAAAAGC 4680 

ACATTGCGAT CCAITCAGTG TTTAACTGTC GTGGCTCATT TTCTGTTCGT TAGCACTTGT 4740 

GTGACAAAAG AGCTCAGATC CGACTTCTCC TATGTGTCAC TTATTCCAAG AACCCAACTA 4800 

TGCCCTTAGG TAGAAAGATT TGACTCGTGT GTCTACTAGC CAACAGGCAG AGCAGGGTTG 4860 

AAAAAAATAT CAGCTCCCAA AGGGCCCATG TGTCTACATC ATCAGTTACT GTCATGCACC 4920 

ACATTTGTGT GCAGATACGA AAAGAGGAGG AAAGAAGAAA AAAATTAATG TGTGGGAGCT 4980 

GCACGTTTAC ATGTTTTGAG CTATGCTTCA AACACAACTG GAAAGCCATC AATCTTCAAA 5040 

GGCCTCAAAA ATACTTTTAT AGTAACAAGT GCACGACTTT AGTTGGGTTA TTCAAGATGG 5100 

CACAAAAAGG TTTCCGCAGA GGTGGTATGC TGTGCTTTTG GCGCAAGTGG TGGGGGGATG 5160 

GGGGTGGGGG TGGAATTTTT TTCTCACTCT AATGACTTCC TATTGGAAAG GCATTGACAG 5220 

CCAGGGACAG GAGCCAGGGT GGGGGTAGTT TTGTGGGAAA GCAGAACTGA AGTTAGCTTA 5280 

AGCATAAAAA CAAAGAAAAA TCTTCGCTTT TCATGTATGT GGAATCCAAG AATAACCATA 5340 

GGCTCTACCA GACCAGGAGG GTAAGGATGG ACACTAAAAT GAAACAAATA CCAAGGTATT 5400 

CCTTCTGCTG CAGCCTGGAG ACCACCGAGA GTCGAGCTGG GGCACACACA CACCTGGCCG 5460 

GGACCCGGCA GGGACAAGGC GGGCCGTGGC CTCCTCCACC AAGTOTCTCT AGACAATTCA 5520 

GGGCCTGCTT TCCCCAGCTC CATGCATGGC TGGACTGGTG ATTCCAGGGT GCAGAAGGGA 5580 

TTCATATTCC CAGAACGCTT TAAGTGTACA CCTGCAGGAT AAAGAGATAC CX3GTTACATT 5640 

ATTAAATGAT TCTAGGGATT CACTGGGGGA TATTTTTGTT GCTTTTACTT TCATGGTTAG 5700 

AGCTACAAAG AACAGTGATT TTTTTTTTTT CTCCCTTCCC CATTCAGAAA CATTATACAT 5760 

TGGGCCATTT TTCTTTCTCC CAAAGAAGAT TCATGGATAG TCAGACTGAA CTGTGTGCAA 5820 

CAGGAAAAGT CAAAAGGGAA AAGGCAGCTG ATGAGGTTAC ATGGTTACAT GTTCTACATC 5880 

ATGCAGAGTA GCTTGAAATC TAGTCTGGAG AAAACTGGAT CAAGATTCTA GCCCACTGGA 5940 

GTTGCAAGGA ATGAGAGGCA AAAATTCTAA AGATTTGGGT TATATTTTCA ACTTGGGGGA 6000 

CAGAGAGAAA TGGAGAGCAG GAATTACAGT TCCAACAAAC ATCATGATAG TCTGGTAGTC 6060 

AAGACAGAGA TTAAGTAAAA CAGGTTTTAC TGTTTAGCTG AGTTCAGTTA ATACAAAATG 6120 

TACATAAAAC GTTAGTCCTT TGAGACTGAC ATGATTAATG ATCAGTGTGG TGGGAAATGA 6180 

TGTAGTTATT GTACACAAGC ACTTGCAAAC TCTTTATCCC TATTTCTTTA AAAGAAAATA 6240 

AGGTGAAATA CGAAGTCCTT GGTCTGATAT AAAGCCCCTA TTGGATTCTT CGGATGCGTA 6300 

AAAGAAATTG CCTGTTTCAG CCAGAAGACT GGTGAAAACA CATACATCAG ACTATGTTGT 6360 

GAGCCAGGTT GATTTTTTAT TTTATTATAT GCAGGTGAGT GTTGAAACTG TTAAAATTCC 6420 

AATTTGTTTT CATTCAGTAT TAGTTTAGTT CTAAATATAG CAAACCCCAT CCAGGTGCTA 6480 

TCAGATGACC AGTTACTGCT TAGTTAACTA GGTGTAAAGT TTTACATATA CATTAATTTC 6540 

AATAGTTTAT TACAAGTTGT GTAAAATGGA CTCTAGTTTA ATAATGGGGG AAAAAAGATT 6600 

AGGTTGCTCC TGAAACTGAC TGTAGAGCAT GTAAAATGAT TTTACTGGAT TCTGTTCAAC 6660 

TGTAATCAAT GAAAAAGATG TACGTTGTAG ACAAAGTTGC AGAATTAAAA AAAGAAATCT 6720 

GCTTTTAATT TATTCTTTTT GTATTAAGAA TTTGTATAGT ATCTTTACAT TTTGCAAAAC 6780 
AGTGTTGTCA ACACTTATTA AAGCATTTTC AAAATG 



Seq ID NO: 32 Protein sequence: 
Protein Accession #: BAA92532 



1 11 21 31 41 51 

I I I I I I 

MAVQLVPDSA LGLLMMTEGR RCQVHLLDDR KLELLVQPKL LAKELLDLVA SHFNLKEKEY 60 

FGIAFTDETG HLNWLQLDRR VLEHDFPKKS GPVVLYFCVR FYIESISYLK DNATIELFFL 120 

NAKSCIYKEL IDVDSEVVFE LASYILQEAK GDFSSNEVVR SDLKKLPALP TQALKEHPSL 180 

AYCEDRVIEH YKKLNGQTRG QAIVNYMSIV ESLPTYGVHY YAVKDKQGIP WWLGLSYKGI 240 

FQYDYHDKVK PRKIFQWRQL ENLYFREKKF SVEVHDPRRA SVTRRTFGHS GIAVHTWYAC 300 

PALIKSIWAM AISQHQFYLD RKQSKSKIHA ARSLSEIAID LTETGTLKTS KLANMGSKGK 360 

IISGSSGSLL SSGSQESDSS QSAKKDMLAA LKSRQEALEE TLRQRLEELK KLCLREAELT 420 

GKLPVEYPLD PGEEPPIVRR RIGTAFKLDE QKILPKGEEA ELERLEREFA IQSQITEAAR 480 

RLASDPNVSK KLKKQRKTSY LNALKKLQEI ENAINENRIK SGKKPTQRAS LIIDDGNIAS 540 

EDSSLSDALV LEDEDSQVTS TISPLHSPHK GLPPRPPSHN RPPPPQSLEG LRQMHYHRND 600 

YDKSPIKPKM WSESSLDEPY EKVKKRSSHS HSSSHKRFPS TGSCAEAGGG SNSLQNSPIR 660 

GLPHWNSQSS MPSTPDLRVR SPHYVHSTRS VDISPTRLHS LALHFRHRSS SLESQGKLLG 720 

SENDTGSPDF YTPRTRSSNG SDPMDDCSSC TSHSSSEHYY PAQMNANYST LAEDSPSKAR 780 

QRQRQRQRAA GALGSASSGS MPNLAARGGA GGAGGAGGGV YLHSQSQPSS QYRIKEYPLY 840 

IEGGATPVVV RSLESDQECH YSVKAQFKTS NSYTAGGLFK ESWRGGGGDE GDTGRLTPSR 900 

SQILRTPSLG REGAHDKGAG RAAVSDELRQ WYQRSTASHK EHSRLSHTSS TSSDSGSQYS 960 

TSSQSTFVAH SRVTRMPQMC KATSAALPQS QRSSTPSSEI GATPPSSPHH ILTWQTGEAT 1020 
ENSPILDGSE SPPHQSTDE 



Seq ID NO: 33 Nucleotide sequence: 
Nucleic Acid Accession #: NM_014331 

Coding sequence: 1..1506 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGTCAGAA AGCCTGTTGT GTCCACCATC TCCAAAGGAG GTTACCTGCA GGGAAATGTT 60 

AACGGGAGGC TGCCTTCCCT GGGCAACAAG GAGCCACCTG GGCAGGAGAA AGTGCAGCTG 120 

AAGAGGAAAG TCACTTTACT GAGGGGAGTC TCCATTATCA TTGGCACCAT CATTGGAGCA 180 

GGAATCTTCA TCTCTCCTAA GGGCGTGCTC CAGAACACGG GCAGCGTGGG CATGTCTCTG 240 

ACCATCTGGA CGGTGTGTGG GGTCCTGTCA CTATTTGGAG CTTTGTCTTA TGCTGAATTG 300 

GGAACAACTA TAAAGAAATC TGGAGGTCAT TACACATATA TTTTGGAAGT CTTTGGTCCA 360 

TTACCAGCTT TTGTACGAGT CTGGGTGGAA CTCCTCATAA TACGCCCTGC AGCTACTGCT 420 

GTGATATCCC TGGCATTTGG ACGCTACATT CTGGAACCAT TTTTTATTCA ATGTGAAATC 480 
CCTGAACTTG CGATCAAGCT CATTACAGCT GTGGGCATAA CTGTAGTGAT GGTCCTAAAT 540 

AGCATGAGTG TCAGCTGGAG CGCCCGGATC CAGATTTTCT TAACCTTTTG CAAGCTCACA 600 

GCAATTCTGA TAATTATAGT CCCTGGAGTT ATGCAGCTAA TTAAAGGTCA AACGCAGAAC 660 

TTTAAAGACG CGTTTTCAGG AAGAGATTCA AGTATTACGC GGTTGCCACT GGCTTTTTAT 720 

TATGGAATGT ATGCATATGC TGGCTGGTTT TACCTCAACT TTGTTACTGA AGAAGTAGAA 780 

AACCCTGAAA AAACCATTCC CCTTGCAATA TGTATATCCA TGGCCATTGT CACCATTGGC 840 

TATGTGCTGA CAAATGTGGC CTACTTTACG ACCATTAATG CTGAGGAGCT GCTGCTTTCA 900 

AATGCAGTGG CAGTGACCTT TTCTGAGCGG CTACTGGGAA ATTTCTCATT AGCAGTTCCG 960 

ATCTTTGTTG CCCTCTCCTG CTTTGGCTCC ATGAACGGTG GTGTGTTTGC TGTCTCCAGG 1020 

TTATTCTATG TTGCGTCTCG AGAGGGTCAC CTTCCAGAAA TCCTCTCCAT GATTCATGTC 1080 

CGCAAGCACA CTCCTCTACC AGCTGTTATT GTTTTGCACC CTTTGACAAT GATAATGCTC 1140 

TTCTCTGGAG ACCTCGACAG TCTTTTGAAT TTCCTCAG2T TTGCCAGGTG GCTTTTTATT 1200 

GGGCTGGCAG TTGCTGGGCT GATTTATCTT CGATACAAAT GCCCAGATAT GCATCGTCCT 1260 

TTCAAGGTGC CACTGTTCAT CCCAGCTTTG TTTTCCTTCA CATGCCTCTT CATGGTTGCC 1320 

CTTTCCCTCT ATTCGGACCC ATTTAGTACA GGGATTGGCT TCGTCATCAC TCTGACTGGA 1380 

GTCCCTGCGT ATTATCTCTT TATTATATGG GACAAGAAAC CCAGGTGGTT TAGAATAATG 1440 

TCAGAGAAAA TAACCAGAAC ATTACAAATA ATACTGGAAG TTGTACCAGA AGAAGATAAG 1500 

TTATGAACTA ATGGACTTGA GATCTTGGCA ATCTGCCCAA GGGGAGACAC AAAATAGGGA 1560 

TTTTTACTTC ATTTTCTGAA AGTCTAGAGA ATTACAACTT TGGTGATAAA CAAAAGGAGT 1620 

CAGTTATTTT TATTCATATA TTTTAGCATA TTCGAACTAA TTTCTAAGAA ATTTAGTTAT 1680 

AACTCTATGT AGTTATAGAA AGTGAATATG CAGTTATTCT ATGAGTCGCA CAATTCTTGA 1740 

GTCTCTGATA CCTACCTATT GGGGTTAGGA GAAAAGACTA GACAATTACT ATGTGGTCAT 1800 

TCTCTACAAC ATATGTTAGC ACGGCAAAGA ACCTTCAAAT TGAAGACTGA GATTTTTCTG 1860 

TATATATGGG TTTTGTAAAG ATGGTTTTAC ACACTACAGA TGTCTATACT GTGAAAAGTG 1920 

TTTTCAATTC TGAAAAAAAG CATACATCAT GATTATGGCA AAGAGGAGAG AAAGAAATTT 1980 

ATTTTACATT GACATTGCAT TGCTTCCCCT TAGATACCAA TTTAGATAAC AAACACTCAT 2040 

GCTTTAATGG ATTATACCCA GAGCACTTTG AACAAAGGTC AGTGGGGATT GTTGAATACA 2100 

TTAAAGAAGA GTTTCTAGGG GCTACTGTTT ATGAGACACA TCCAGGAGTT ATGTTTAAGT 2160 

AAAAATCCTT GAGAATTTAT TATGTCAGAT GTTTTTTCAT TCATTATCAG GAAGTTTTAG 2220 

TTATCTGTCA TTTTTTTTTT TCACATCAGT TTGATCAGGA AAGTGTATAA CACATCTTAG 2280 

AGCAAGAGTT AGTTTGGTAT TAAATCCTCA TTAGAACAAC CACCTGTTTC ACTAATAACT 2340 

TACCCCTGAT GAGTCTATCT AAACATATGC ATTTTAAGCC TTCAAATTAC ATTATCAACA 2400 

TGAGAGAAAT AACCAACAAA GAAGATGTTC AAAATAATAG TCCCATATCT GTAATCATAT 2460 

CTACATGCAA TGTTAGTAAT TCTGAAGTTT TTTAAATTTA TGGCTATTTT TACACGATGA 2520 

TGAATTTTGA CAGTTTGTGC ATTTTCTTTA TACATTTTAT ATTCTTCTGT TAAAATATCT 2580 

CTTCAGATGA AACTGTCCAG ATTAATTAGG AAAAGGCATA TATTAACATA AAAATTGCAA 2640 

AAGAAATGTC GCTGTAAATA AGATTTACAA CTGATGTTTC TAGAAAATTT CCACTTCTAT 2700 

ATCTAGGCTT TGTCAGTAAT TTCCACACCT TAATTATCAT TCAACTTGCA AAAGAGACAA 2760 

CTGATAAGAA GAAAATTGAA ATGAGAATCT GTGGATAAGT GTTTGTGTTC AGAAGATGTT 2820 

GTTTTGCCAG TATTAGAAAA TACTGTGAGC CGGGCATGGT GGCTTACATC TGTAATCCCA 2880 

GCACTTTGGG AGGCTGAGGG GGTGGATCAC CTGAGGTCGG GAGTTCTAGA CCAGCCTGAC 2940 

CAACATGGAG AAACCCCATC TCTACTAAAA ATACAAAATT AGCTGGGCAT GGTGGCACAT 3000 

GCTGGTAATC TCAGCTATTG AGGAGGCTGA GGCAGGAGAA TTGCTTGAAC CCGGGAGGCG 3060 

GAGGTTGCAG TGAGCCAAGA TTGCACCACT GTACTCCAGC CTGGGTGACA AAGTCAGACT 3120 
CCATCTCCAA AAAAAAAAAA AAAA 



Seq ID NO: 34 Protein sequence: 
Protein Accession #: NP_055146 



1 11 21 31 41 51 

I I I I I I 

MVRKFWSTI SKGGYLQGNV NGRLPSLGNK EPPGQEKVQL KRKVTLLRGV SIIIGTIIGA 60 

GIFISPKGVL QNTGSVGMSI* TIWTVCGVLS LFGALSYAEL GTTIKKSGGH YTYILEVFGP 120 

LPAFVRVWVE LLIIRPAATA VISIAFGRYI LEPFFIQCEI PELAIKLITA VGITWMVLN 180 

SMSVSWSARI QIFLTFCKLT AILIIIVPGV MQLIKGQTQN FKDAFSGRDS SITRLPLAFY 240 

YGMYAYAGWP YLNFVTEEVE NPBKTIPLAI CISMAIVTIG YVLTNVAYFT TTNAEELLLS 300 

NAVAVTFSER LLGNFSLAVP IFVALSCFGS MNGGVFAVSR LFYVASREGH LPEILSMIHV 360 

RKHTPLPAVI VLHPLTMIML FSGDLDSLLN FLSFARWLFI GLAVAGLIYL RYKCPDMHRP 420 

FKVPIiFIPAL FSFTCLFMVA LSLYSDPFST GIGFVTTLTG VPAYYLFIIW DKKPRWFRIM 480 
SEKITRTLQI ILEWPEEDK L 



Seq ID NO: 35 Nucleotide sequence: 
Nucleic Acid Accession #: NM_002422 

Coding sequence: 64..1497 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ACAAGGAGGC AGGCAAGACA GCAAGGCATA GAGACAACAT AGAGCTAAGT AAAGCCAGTG 60 

GAAATGAAGA GTCTTCCAAT CCTACTGTTG CTGTGCGTGG CAGTTTGCTC AGCCTATCCA 120 

TTGGATGGAG CTGCAAGGGG TGAGGACACC AGCATGAACC TTGTTCAGAA ATATCTAGAA 180 

AACTACTACG ACCTCAAAAA AGATGTGAAA CAGTTTGTTA GGAGAAAGGA CAGTGGTCCT 240 

GTTGTTAAAA AAATCCGAGA AATGCAGAAG TTCCTTGGAT TGGAGGTGAC GGGGAAGCTG 300 
GACTCCGACA CTCTGGAGGT GATGCGCAAG CCCAGGTGTG GAGTTCCTGA TGTTGGTCAC 360 

TTCAGAACCT TTCCTGGCAT CCCGAAGTGG AGGAAAACCC ACCTTACATA CAGGATTGTG 420 

AATTATACAC CAGATTTGCC AAAAGATGCT GTTGATTCTG CTGTTGAGAA AGCTCTGAAA 480 

GTCTGGGAAG AGGTGACTCC ACTCACATTC TCCAGGCTGT ATGAAGGAGA GGCTGATATA 540 

ATGATCTCTT TTGCAGTTAG AGAACATGGA GACTTTTACC CTTTTGATGG ACCTGGAAAT 600 

GTTTTGGCCC ATGCCTATGC CCCTGGGCCA GGGATTAATG GAGATGCCCA CTTTGATGAT 660 

GATGAACAAT GGACAAAGGA TACAACAGGG ACCAATTTAT TTCTCGTTGC TGCTCATGAA 720 

ATTGGCCACT CCCTGGGTCT CTTTCACTCA GCCAACACTG AAGCTTTGAT GTACCCACTC 780 

TATCACTCAC TCACAGACCT GACTCGGTTC CGCCTGTCTC AAGATGATAT AAATGGCATT 840 

CAGTCCCTCT ATGGACCTCC CCCTGACTCC CCTGAGACCC CCCTGGTACC CACGGAACCT 900 

GTCCCTCCAG AACCTGGGAC GCCAGCCAAC TGTGATCCTG CTTTGTCCTT TGATGCTGTC 960 

AGCACTCTGA GGGGAGAAAT CCTGATCTTT AAAGACAGGC ACTTTTGGCG CAAATCCCTC 1020 

AGGAAGCTTG AACCTGAATT GCATTTGATC TCTTCATTTT GGCCATCTCT TCCTTCAGGC 1080 

GTGGATGCCG CATATGAAGT TACTAGCAAG GACCTCGTTT TCATTTTTAA AGGAAATCAA 1140 

TTCTGGGCCA TCAGAGGAAA TGAGGTACGA GCTGGATACC CAAGAGGCAT CCACACCCTA 1200 

GGTTTCCCTC CAACCGTGAG GAAAATCGAT GCAGCCATTT CTGATAAGGA AAAGAACAAA 1260 

ACATATTTCT TTGTAGAGGA CAAATACTGG AGATTTGATG AGAAGAGAAA TTCCATGGAG 1320 

CCAGGCTTTC CCAAGCAAAT AGCTGAAGAC TTTCCAGGGA TTGACTCAAA GATTGATGCT 1380 

GTTTTTGAAG AATTTGGGTT CTTTTATTTC TTTACTGGAT CTTCACAGTT GGAGTTTGAC 1440 

CCAAATGCAA AGAAAGTGAC ACACACTTTG AAGAGTAACA GCTGGCTTAA TTGTTGAAAG 1500 

AGATATGTAG AAGGCACAAT ATGGGCACTT TAAATGAAGC TAATAATTCT TCACCTAAGT 1560 

CTCTGTGAAT TGAAATGTTC GTTTTCTCCT GCCTGTGCTG TGACTCGAGT CACACTCAAG 1620 

GGAACTTGAG CGTGAATCTG TATCTTGCCG GTCATTTTTA TGTTATTACA GGGCATTCAA 1680 

ATGGGCTGCT GCTTAGCTTG CACCTTGTCA CATAGAGTGA TCTTTCCCAA GAGAAGGGGA 1740 

AGCACTCGTG TGCAACAGAC AAGTGACTGT ATCTGTGTAG ACTATTTGCT TATTTAATAA 1800 
AGACGATTTG TCAGTTGTTT T 



Seq ID NO: 36 Protein sequence: 
Protein Accession #: NP_002413 



1 11 21 31 41 51 

I I I I I I 

MKSLPILLLL CVAVCSAYPL DGAARGEDTS MNLVQKYLEN YYDLKKDVKQ FVRRKDSGPV 60 

VKKIREMQKF LGLEVTGKLD SDTLEVMRKP RCGVPDVGHF RTFPGIPKWR KTKLTYRIVN 120 

YTPDLPKDAV DSAVEKALKV WEEVTPLTFS RLYEGEADIM ISFAVREHGD FYPFDGPGNV 180 

IiAHAYAPGPG INGDAHFDDD EQWTKDTTGT NLFLVAAHEI GHSLGLFHSA NTEALMYPLY 240 

HSLTDLTRFR LSQDDINGIQ SLYGPPPDSP ETPLVPTEPV PPBPGTPANC DPALSFDAVS 300 

TLRGEILIFK DRHFWRKSLR KLEPEIiHLIS SFWPSLPSGV DAAYEVTSKD LVFIFKGNQF 360 

WAIRGNEVRA GYPRGIHTLG FPPTVRKIDA AISDKEKNKT YFFVEDKYWR FDEKRNSMEP 420 

GFPKQIAEDF PGIDSKIDAV FEEFGFFYFF TGSSQLEFDP NAKKVTHTLK SNSWLNC 



Seq ID NO: 37 Nucleotide sequence: 
Nucleic Acid Accession #: NM_003246 

Coding sequence: 112..3624 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGACGCACAG GCATTCCCCG CGCCCCTCCA GCCCTOGCCG CCCTCGCCAC CGCTCCCGGC 60 

CGCCGCGCTC CGGTACACAC AGGATCCCTG CTGGGCACCA ACAGCTCCAC CATGGGGCTG 120 

GCCTGGGGAC TAGGCGTCCT GTTCCTGATG CATGTGTGTG GCACCAACCG CATTCCAGAG 180 

TCTGGCGGAG ACAACAGCGT GTTTGACATC TTTGAACTCA CCGGGGCOGC CCGCAAGGGG 240 

TCTGGGCGCC GACTGGTGAA GGGCCCCGAC CCTTCCAGCC CAGCTTTCCG CATCGAGGAT 300 

GCCAACCTGA TCCCCCCTGT GCCTGATGAC AAGTTCCAAG ACCTGGTGGA TGCTGTGCGG 360 

GCAGAAAAGG GTTTCCTCCT TCTGGCATCC CTGAGGCAGA TGAAGAAGAC CCGGGGCACG 420 

CTGCTGGCCC TGGAGCGGAA AGACCACTCT GGCCAGGTCT TCAGCGTGGT GTCCAATGGC 480 

AAGGCGGGCA CCCTGGACCT CAGCCTGACC GTCCAAGGAA AGCAGCACGT GGTGTCTGTG 540 

GAAGAAGCTC TCCTGGCAAC CGGCCAGTGG AAGAGCATCA CCCTGTTTGT GCAGGAAGAC 600 

AGGGCCCAGC TGTACATCGA CTGTGAAAAG ATGGAGAATG CTGAGTTGGA CGTCCCCATC 660 

CAAAGCGTCT TCACCAGAGA CCTGGCCAGC ATCGCCAGAC TCCGCATCGC AAAGGGGGGC 720 

GTCAATGACA ATTTCCAGGG GGTGCTGCAG AATGTGAGGT TTGTCTTTGG AACCACACCA 780 

GAAGACATCC TCAGGAACAA AGGCTGCTCC AGCTCTACCA GTGTCCTCCT CACCCTTGAC 840 

AACAACGTGG TGAATGGTTC CAGCCCTGCC ATCCGCACTA ACTACATTGG CCACAAGACA 900 

AAGGACTTGC AAGCCATCTG CGGCATCTCC TGTGATGAGC TGTCCAGCAT GGTCCTGGAA 960 

CTCAGGGGCC TGCGCACCAT TGTGACCACG CTGCAGGACA GCATCCGCAA AGTGACTGAA 1020 

GAGAACAAAG AGTTGGCCAA TGAGCTGAGG CGGCCTCCCC TATGCTATCA CAACGGAGTT 1080 

CAGTACAGAA ATAACGAGGA ATGGACTGTT GATAGCTGCA CTGAGTGTCA CTGTCAGAAC 1140 

TCAGTTACCA TCTGCAAAAA GGTGTCCTGC CCCATCATGC CCTGCTCCAA TGCCACAGTT 1200 

CCTGATGGAG AATGCTGTCC TCGCTGTTGG CCCAGCGACT CTGCGGACGA TGGCTGGTCT 1260 

CCATGGTCCG AGTGGACCTC CTGTTCTACG AGCTGTGGCA ATGGAATTCA GCAGCGCGGC 1320 

CGCTCCTGCG ATAGCCTCAA CAACCGATGT GAGGGCTCCT CGGTCCAGAC ACGGACCTGC 1380 

CACATTCAGG AGTGTGACAA AAGATTTAAA CAGGATGGTG GCTGGAGCCA CTGGTCCCCG 1440 

TGGTCATCTT GTTCTGTGAC ATGTGGTGAT GGTGTGATCA CAAGGATCCG GCTCTGCAAC 1500 
TCTCCCAGCC CCCAGATGAA TGGGAAACCC TGTGAAGGCG AAGCGCGGGA GACCAAAGCC 1560 

TGCAAGAAAG ACGCCTGCCC CATCAATGGA GGCTGGGGTC CTTGGTCACC ATGGGACATC 1620 

TGTTCTGTCA CCTGTGGAGG AGGGGTACAG AAACGTAGTC GTCTCTGCAA CAACCCCGCA 1680 

CCCCAGTTTG GAGGCAAGGA CTGCGTTGGT GATGTAACAG AAAACCAGAT CTGCAACAAG 1740 

CAGGACTGTC CAATTGATGG ATGCCTGTCC AATCCCTGCT TTGCCGGCGT GAAGTGTACT 1800 

AGCTACCCTG ATGGCAGCTG GAAATGTGGT GCTTGTCCCC CTGGTTACAG TGGAAATGGC 1860 

ATCCAGTGCA CAGATGTTGA TGAGTGCAAA GAAGTGCCTG ATGCCTGCTT CAACCACAAT 1920 

GGAGAGCACC GGTGTGAGAA CACGGACCCC GGCTACAACT GCCTGCCCTG CCCCCCACGC 1980 

TTCACCGGCT CACAGCCCTT CGGCCAGGGT GTCGAACATG CCACGGCCAA CAAACAGGTG 2040 

TGCAAGCCCC GTAACCCCTG CACGGATGGG ACCCACGACT GCAACAAGAA CGCCAAGTGC 2100 

AACTACCTGG GCCACTATAG CGACCCCATG TACCGCTGCG AGTGCAAGCC TGGCTACGCT 2160 

GGCAATGGCA TCATCTGCGG GGAGGACACA GACCTGGATG GCTGGCCCAA TGAGAACCTG 2220 

GTGTGCGTGG CCAATGCGAC TTACCACTGC AAAAAGGATA ATTGCCCCAA CCTTCCCAAC 2280 

TCAGGGCAGG AAGACTATGA CAAGGATGGA ATTGGTGATG CCTGTGATGA TGACGATGAC 2340 

AATGATAAAA TTCCAGATGA CAGGGACAAC TGTCCATTCC ATTACAACCC AGCTCAGTAT 2400 

GACTATGACA GAGATGATGT GGGAGACCGC TGTGACAACT GTCCCTACAA CCACAACCCA 2460 

GATCAGGCAG ACACAGACAA CAATGGGGAA GGAGACGCCT GTGCTGCAGA CATTGATGGA 2520 

GACGGTATCC TCAATGAACG GGACAACTGC CAGTACGTCT ACAATGTGGA CCAGAGAGAC 2580 

ACTGATATGG ATGGGGTTGG AGATCAGTGT GACAATTGCC CCTTGGAACA CAATCCGGAT 2640 

CAGCTGGACT CTGACTCAGA CCGCATTGGA GATACCTGTG ACAACAATCA GGATATTGAT 2700 

GAAGATGGCC ACCAGAACAA TCTGGACAAC TGTCCCTATG TGCCCAATGC CAACCAGGCT 2760 

GACCATGACA AAGATGGCAA GGGAGATGCC TGTGACCACG ATGATGACAA CGATGGCATT 2820 

CCTGATGACA AGGACAACTG CAGACTCGTG CCCAATCCCG ACCAGAAGGA CTCTGACGGC 2880 

GATGGTCGAG GTGATGCCTG CAAAGATGAT TTTGACCATG ACAGTGTGCC AGACATCGAT 2940 

GACATCTGTC CTGAGAATGT TGACATCAGT GAGACCGATT TCCGCCGATT CCAGATGATT 3000 

CCTCTGGACC CCAAAGGGAC ATCCCAAAAT GACCCTAACT GGGTTGTACG CCATCAGGGT 3060 

AAAGAACTCG TCCAGACTGT CAACTGTGAT CCTGGACTCG CTGTAGGTTA TGATGAGTTT 3120 

AATGCTGTGG ACTTCAGTGG CACCTTCTTC ATCAACACCG AAAGGGACGA TGACTATGCT 3180 

GGATTTGTCT TTGGCTACCA GTCCAGCAGC CGCTTTTATG TTGTGATGTG GAAGCAAGTC 3240 

ACCCAGTCCT ACTGGGACAC CAACCCCACG AGGGCTCAGG GATACTCGGG CCTTTCTGTG 3300 

AAAGTTGTAA ACTCCACCAC AGGGCCTGGC GAGCACCTGC GGAACGCCCT GTGGCACACA 3360 

GGAAACACCC CTGGCCAGGT GCGCACCCTG TGGCATGACC CTCGTCACAT AGGCTGGAAA 3420 

GATTTCACCG CCTACAGATG GCGTCTCAGC CACAGGCCAA AGACGGGTTT CATTAGAGTG 3480 

GTGATGTATG AAGGGAAGAA AATCATGGCT GACTCAGGAC CCATCTATGA TAAAACCTAT 3540 

GCTGGTGGTA GACTAGGGTT GTTTGTCTTC TCTCAAGAAA TGGTGTTCTT CTCTGACCTG 3600 

AAATACGAAT GTAGAGATCC CTAATCATCA AATTGTTGAT TGAAAGACTG ATCATAAACC 3660 

AATGCTGGTA TTGCACCTTC TGGAACTATG GGCTTGAGAA AACCCCCAGG ATCACTTCTC 3720 

CTTGGCTTCC TTCTTTTCTG TGCTTGCATC AGTGTGGACT CCTAGAACGT GCGACCTGCC 3780 

TCAAGAAAAT GCAGTTTTCA AAAACAGACT CATCAGCATT CAGCCTCCAA TGAATAAGAC 3840 

ATCTTCCAAG CATATAAACA ATTGCTTTGG TTTCCTTTTG AAAAAGCATC TACTTGCTTC 3900 

AGTTGGGAAG GTGCCCATTC CACTCTGCCT TTGTCACAGA GCAGGGTGCT ATTGTGAGGC 3960 

CATCTCTGAG CAGTGGACTC AAAAGCATTT TCAGGCATGT CAGAGAAGGG AGGACTCACT 4020 

AGAATTAGCA AACAAAACCA CCCTGACATC CTCCTTCAGG AACACGGGGA GCAGAGGCCA 4080 

AAGCACTAAG GGGAGGGCGC ATACCCGAGA CGATTGTATG AAGAAAATAT GGAGGAACTG 4140 

TTACATGTTC GGTACTAAGT CATTTTCAGG GGATTGAAAG ACTATTGCTG GATTTCATGA 4200 

TGCTGACTGG CGTTAGCTGA TTAACCCATG TAAATAGGCA CTTAAATAGA AGCAGGAAAG 4260 

GGAGACAAAG ACTGGCTTCT GGACTTCCTC CCTGATCCCC ACCCTTACTC ATCACCTTGC 4320 

AGTGGCCAGA ATTAGGGAAT CAGAATCAAA CCAGTGTAAG GCAGTGCTGG CTGCCATTGC 4380 

CTGGTCACAT TGAAATTGGT GGCTTCATTC TAGATGTAGC TTGTGCAGAT GTAGCAGGAA 4440 

AATAGGAAAA CCTACCATCT CAGTGAGCAC CAGCTGCCTC CCAAAGGAGG GGCAGCCGTG 4500 

CTTATATTTT TATGGTTACA ATGGCACAAA ATTATTATCA ACCTAACTAA AACATTCCTT 4560 

TTCTCTTTTT TCCGTAATTA CTAGGTAGTT TTCTAATTCT CTCTTTTGGA AGTATGATTT 4620 

TTTTAAAGTC TTTACGATGT AAAATATTTA TTTTTTACTT ATTCTGGAAG ATCTGGCTGA 4680 

AGGATTATTC ATGGAACAGG AAGAAGCGTA AAGACTATCC ATGTCATCTT TGTTGAGAGT 4740 

CTTCGTGACT GTAAGATTGT AAATACAGAT TATTTATTAA CTCTGTTCTG CCTGGAAATT 4800 

TAGGCTTCAT ACGGAAAGTG TTTGAGAGCA AGTAGTTGAC ATTTATCAGC AAATCTCTTG 4860 

CAAGAACAGC ACAAGGAAAA TCAGTCTAAT AAGCTGCTCT GCCCCTTGTG CTCAGAGTGG 4920 

ATGTTATGGG ATTCCTTTTT TCTCTGTTTT ATCTTTTCAA GTGGAATTAG TTGGTTATCC 4980 

ATTTGCAAAT GTTTTAAATT GCAAAGAAAG CCATGAGGTC TTCAATACTG TTTTACCCCA 5040 

TCCCTTGTGC ATATTTCCAG GGAGAAGGAA AGCATATACA CTTTTTTCTT TCATTTTTCC 5100 

AAAAGAGAAA AAAATGACAA AAGGTGAAAC TTACATACAA ATATTACCTC ATTTGTTGTG 5160 

TGACTGAGTA AAGAATTTTT GGATCAAGCG GAAAGAGTTT AAGTGTCTAA CAAACTTAAA 5220 

GCTACTGTAG TACCTAAAAA GTCAGTGTTG TACATAGCAT AAAAACTCTG CAGAGAAGTA 5280 

TTCCCAATAA GGAAATAGCA TTGAAATGTT AAATACAATT TCTGAAAGTT ATGTTTTTTT 5340 

TCTATCATCT GGTATACCAT TGCTTTATTT TTATAAATTA TTTTCTCATT GCCATTGGAA 5400 

TAGAATATTC AGATTGTGTA GATATGCTAT TTAAATAATT TATCAGGAAA TACTGCCTGT 5460 

AGAGTTAGTA TTTCTATTTT TATATAATGT TTGCACACTG AATTGAAGAA TTGTTGGTTT 5520 

TTTCTTTTTT TTGTTTTTTT TTTTTTTTTT TTTTTTTTTG CTTTTGACCT CCCATTTTTA 5580 

CTATTTGCCA ATACCTTTTT CTAGGAATGT GCTTTTTXTT GTACACATTT TTATCCATTT 5640 

TACATTCTAA AGCAGTGTAA GTTGTATATT ACTGTTTCTT ATGTACAAGG AACAACAATA 5700 
AATCATATGG AAATTTATAT TT 

Seq ID NO: 38 Protein sequence: 
Protein Accession #: NP_003237 

1 11 21 31 41 51 

I I I I I I 

MGLAWGLGVL FLMHVCGTNR IPESGGDNSV FDIFELTGAA RKGSGRRLVK GPDPSSPAFR 60 
IEDANLIPPV PDDKFQDLVD AVRAEKGFLL LASLRQMKKT RGTLLALERK DHSGQVFSVV 120 
SNGKAGTLDL SLTVQGKQHV VSVEEALLAT GQWKSITLFV QEDRAQLYID CEKMENAELD 180 
VPIQSVFTRD LASIARLRIA KGGVNDNFQG VLQNVRFVFG TTPEDILRNK GCSSSTSVLL 240 
TLDNNVVNGS SPAIRTNYIG HKTKDLQAIC GISCDELSSM VLELRGLRTI VTTLQDSIRK 300 
VTEENKELAN ELRRPPLCYH NGVQYRNNEE WTVDSCTECH CQNSVTICKK VSCPIMPCSN 360 
ATVPDGECCP RCWPSDSADD GWSPWSEWTS CSTSCGNGIQ QRGRSCDSLN NRCEGSSVQT 420 
RTCHIQECDK RFKQDGGWSH WSPWSSCSVT CGDGVITRIR LCNSPSPQMN GKPCEGEARE 480 
TKACKKDACP INGGWGPWSP WDICSVTCGG GVQKRSRLCN NPAPQFGGKD CVGDVTENQI 540 
CNKQDCPIDG CLSNPCFAGV KCTSYPDGSW KCGACPPGYS GNGIQCTDVD ECKEVPDACF 600 
NHNGEHRCEN TDPGYNCLPC PPRFTGSQPF GQGVEHATAN KQVCKPRNPC TDGTHDCNKN 660 
AKCNYLGHYS DPMYRCECKP GYAGNGIICG EDTDLDGWPN ENLVCVANAT YHCKKDNCPN 720 
LPNSGQEDYD KDGIGDACDD DDDNDKIPDD RDNCPFHYNP AQYDYDRDDV GDRCDNCPYN 780 
HNPDQADTDN NGEGDACAAD IDGDGILNER DNCQYVYNVD QRDTDMDGVG DQCDNCPLEH 840 
NPDQLDSDSD RIGDTCDNNQ DIDEDGHQNN LDNCPYVPNA NQADHDKDGK GDACDHDDDN 900 
DGIPDDKDNC RLVPNPDQKD SDGDGRGDAC KDDFDHDSVP DIDDICPENV DISETDFRRF 960 
QMIPLDPKGT SQNDPNWVVR HQGKELVQTV NCDPGLAVGY DEFNAVDFSG TFFINTERDD 1020 
DYAGFVFGYQ SSSRFYVVMW KQVTQSYWDT NPTRAQGYSG LSVKVVNSTT GPGEHLRNAL 1080 
WHTGNTPGQV RTLWHDPRHI GWKDFTAYRW RLSHRPKTGF IRVVMYEGKK IMADSGPIYD 1140 
KTYAGGRLGL FVFSQEMVFF SDLKYECRDP 



Seq ID NO: 39 Nucleotide sequence: 
Nucleic Acid Accession #: BC004299 

Coding sequence: 69..1235 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CCCGACCCGT GCGAGGGCCA GGTCCGCGCC TGCCCCGCCA GGCGAAGCGA GGCGACCCGC 60 
GTGCGGCCAT GGCTTCGCTG CTGGGAGCCT ACCCTTGGCC CGAGGGTCTC GAGTGCCCGG 120 
CCCTGGACGC CGAGCTGTCG GATGGACAAT CGCCGCCGGC CGTCCCCCGG CCCCCGGGGG 180 
ACAAGGGCTC CGAGAGCCGT ATCCGGCGGC CCATGAACGC CTTCATGGTT TGGGCCAAGG 240 
ACGAGAGGAA ACGGCTGGCA GTGCAGAACC CGGACCTGCA CAACGCCGAG CTCAGCAAGA 300 
TGCTGGGAAA GTCGTGGAAG GCGCTGACGC TGTCCCAGAA GAGGCCGTAC GTGGACGAGG 360 
GGGAGCGGCT GCGCCTGCAG CACATGCAGG ACTACCCCAA CTACAAGTAC CGGCCGCGCA 420 
GGAAGAAGCA GGCCAAGCGG CTGTGCAAGC GCGTGGACCC GGGCTTCCTT CTGAGCTCCC 480 
TCTCCCGGGA CCAGAACGCC CTGCCGGAGA AGAGAAGCGG CAGCCGGGGG GCGCTGGGGG 540 
AGAAGGAGGA CAGGGGTGAG TACTCCCCCG GCACTGCCCT GCCCAGCCTC CGGGGCTGCT 600 
ACCACGAGGG GCCGGCTGGT GGTGGCGGCG GCGGCACCCC GAGCAGTGTG GACACGTACC 660 
CGTACGGGCT GCCCACACCT CCTGAAATGT CTCCCCTGGA CGTGCTGGAG CCGGAGCAGA 720 
CCTTCTTCTC CTCCCCCTGC CAGGAGGAGC ATGGCCATCC CCGCCGCATC CCCCACCTGC 780 
CAGGGCACCC GTACTCACCG GAGTACGCCC CAAGCCCTCT CCACTGTAGC CACCCCCTGG 840 
GCTCCCTGGC CCTTGGCCAG TCCCCCGGCG TCTCCATGAT GTCCCCTGTA CCCGGCTGTC 900 
CCCCATCTCC TGCCTATTAC TCCCCGGCCA CCTACCACCC ACTCCACTCC AACCTCCAAG 960 
CCCACCTGGG CCAGCTTTCC CCGCCTCCTG AGCACCCTGG CTTCGACGCC CTGGATCAAC 1020 
TGAGCCAGGT GGAACTCCTG GGGGACATGG ATCGCAATGA ATTCGACCAG TATTTGAACA 1080 
CTCCTGGCCA CCCAGACTCC GCCACAGGGG CCATGGCCCT CAGTGGGCAT GTTCCGGTCT 1140 
CCCAGGTGAC ACCAACGGGT CCCACAGAGA CCAGCCTCAT CTCCGTCCTG GCTGATGCCA 1200 
CGGCCACGTA CTACAACAGC TACAGTGTGT CATAGAGCTG GAGGCGCCCC GTCCGGTCAG 1260 
CCCTCGCGCC CTCTCCTTCT TGTGCCTTGA GTGGCAGAGG AGCCGTCCAG CCACACCAGC 1320 
TTTCCTCCCA CCGCTCAGGG CAGGGAGGTC TGAACTGCGG CCCCAGAGCC TTTGGCCTAA 1380 
GCTGGACTCT CCTTATCCGA GTGCCGCCTC TATCCCCTTC CCCACGTTCC AGCCCCTGCA 1440 
GCCCACATTT TAAGTATATT CCTTCAAGTG AGTTTTCCTC CAGCCCCTGA GAGTTGCTGT 1500 
CTCCCAGTGG AATGTTCACT GACGTCTTTT CTTGGTAGCC ATCATCGAAA CTAATGGGGG 1560 
GACAGACTTG ATAGCCAAGG TCCCTTCTGG TCCAGTTTTC TGATTTAGGG TTCTCTCAAG 1620 
ATTAATAAAG GAAGATGGGG AAATTTGACT CATTAATGAG CTCGCTAACC TACGATCTGG 1680 
TGATAATTTT GTGTGCACAG CCCAAGGACC ACGAGGCTTT CTGCACTTTC TGCACCCCCT 1740 
TCCAAAGTGA CCACAAAATT TCAAAGGGAC TCATACAATT TGAGAAAAAA CAGTCAACCT 1800 
GATTTGAGAA ATTAACCAGT ATGGCTAACT ATATCACAGA AAATGGGATT GAGTTAAAAC 1860 
TATTTTATTT TAAATATACA TTTTAAAGCA GTTCTTTTTT TTTGTTAATT TGTTTATTAT 1920 
ACACACACTT CAAGAGCCAC CGCGCCCAGC CTACATTTAT AATTTTCATT CTCTTTTACC 1980 
TATAAAATTC AGTGTATTAG TTTCATTACA TAGGAGAAAT TATATTTCTA AACATTTTAT 2040 
GATGTTTAAA AACAAAACAG GCTGTTGTAA AAAAAAAAAA AAAAAAAAA 



Seq ID NO: 40 Protein sequence: 
Protein Accession #: AAH04299 



1 11 21 31 41 51 

I I I I I I 

MASliDGAYPW PEGLBCPALD AEIiSDGQSPP AVPRPPGDKG SESRIRRPMN AFMVWAKDBR 60 
KRIAVQNPDL HNABLSKMLG KSWKALTLSQ KRPYVDBAER LRLQHMQDYP NYKYRPRRKK 120 
QAKRLCKRVD PGFIiLSSIiSR DQNALPEKRS GSRGALGEKB DRGEYSPGTA LPSLRGCYHB 180 
GPAGGGGGGT PSSVDTYPYG LPTPPEMSPL DVLEPBQTFF SSPCQEEHGH PRRIPHLPGH 240 

PYSPEYAPSP LHCSHPLGSL ALGQSPGVSM MSPVPGCPPS PAYYSPATYH PLHSNLQAHL 300 

GQLSPPPEHP GFDALDQLSQ VELLGDMDRN EFDQYLNTPG HPDSATGAMA LSGHVPVSQV 360 
TPTGPTETSL ISVLADATAT YYNSYSVS 



Seq ID NO: 41 Nucleotide sequence: 
Nucleic Acid Accession #: NM_004449 

Coding sequence: 1..1389 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGATTCAGA CTGTCCCGGA CCCAGCAGCT CATATCAAGG AAGCCTTATC AGTTGTGAGT 60 

GAGGACCAGT CGTTGTTTGA GTGTGCCTAC GGAACGCCAC ACCTGGCTAA GACAGAGATG 120 

ACCGCGTCCT CCTCCAGCGA CTATGGACAG ACTTCCAAGA TGAGCCCACG CGTCCCTCAG 180 

CAGGATTGGC TGTCTCAACC CCCAGCCAGG GTCACCATCA AAATGGAATG TAACCCTAGC 240 

CAGGTGAATG GCTCAAGGAA CTCTCCTGAT GAATGCAGTG TGGCCAAAGG CGGGAAGATG 300 

GTGGGCAGCC CAGACACCGT TGGGATGAAC TACGGCAGCT ACATGGAGGA GAAGCACATG 360 

CCACCCCCAA ACATGACCAC GAACGAGCGC AGAGTTATCG TGCCAGCAGA TCCTACGCTA 420 

TGGAGTACAG ACCATGTGCG GCAGTGGCTG GAGTGGGCGG TGAAAGAATA TGGCCTTCCA 480 

GACGTCAACA TCTTGTTATT CCAGAACATC GATGGGAAGG AACTGTGCAA GATGACCAAG 540 

GACGACTTCC AGAGGCTCAC CCCCAGCTAC AACGCCGACA TCCTTCTCTC ACATCTCCAC 600 

TACCTCAGAG AGACTCCTCT TCCACATTTG ACTTCAGATG ATGTTGATAA AGCCTTACAA 660 

AACTCTCCAC GGTTAATGCA TGCTAGAAAC ACAGATTTAC CATATGAGCC CCCCAGGAGA 720 

TCAGCCTGGA CCGGTCACGG CCACCCCACG CCCCAGTCGA AAGCTGCTCA ACCATCTCCT 780 

TCCACAGTGC CCAAAACTGA AGACCAGCGT CCTCAGTTAG ATCCTTATCA GATTCTTGGA 840 

CCAACAAGTA GCCGCCTTGC AAATCCAGGC AGTGGCCAGA TCCAGCTTTG GCAGTTCCTC 900 

CTGGAGCTCC TGTCGGACAG CTCCAACTCC AGCTGCATCA CCTGGGAAGG CACCAACGGG 960 

GAGTTCAAGA TGACGGATCC CGACGAGGTG GCCCGGCGCT GGGGAGAGCG GAAGAGCAAA 1020 

CCCAACATGA ACTACGATAA GCTCAGCCGC GCCCTCCGTT ACTACTATGA CAAGAACATC 1080 

ATGACCAAGG TCCATGGGAA GCGCTACGCC TACAAGTTCG ACTTCCACGG GATCGCCCAG 1140 

GCCCTCCAGC CCCACCCCCC GGAGTCATCT CTGTACAAGT ACCCCTCAGA CCTCCCGTAC 1200 

ATGGGCTCCT ATCACGCCCA CCCACAGAAG ATGAACTTTG TGGCGCCCCA CCCTCCAGCC 1260 

CTCCCCGTGA CATCTTCCAG TTTTTTTGCT GCCCCAAACC CATACTGGAA TTCACCAACT 1320 

GGGGGTATAT ACCCCAACAC TAGGCTCCCC ACCAGCCATA TGCCTTCTCA TCTGGGCACT 1380 
TACTACTAA 
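
The "Coding sequence: start..end" ranges given with the nucleotide sequences in this listing appear to be 1-based and inclusive, with the first three bases of the range forming the start codon and the last three the stop codon. The following is a minimal sketch, in Python, of how such a range could be checked against a blocked sequence. The helper names `extract_cds` and `has_start_and_stop` and the toy sequence are illustrative assumptions and are not part of the listing.

```python
# Illustrative sketch only: helper names and the toy sequence are hypothetical.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_cds(seq: str, start: int, end: int) -> str:
    """Return bases start..end (1-based, inclusive) of a blocked sequence.

    Whitespace between 10-base blocks is removed before slicing.
    """
    cleaned = "".join(seq.split()).upper()
    return cleaned[start - 1:end]


def has_start_and_stop(cds: str) -> bool:
    """True if cds is a whole number of codons, begins with ATG,
    and ends with one of the three standard stop codons."""
    return len(cds) % 3 == 0 and cds[:3] == "ATG" and cds[-3:] in STOP_CODONS


# Toy example (not taken from the listing): the coding range is 5..16.
toy = "GGAAATGAAG TCTTAACC"
cds = extract_cds(toy, 5, 16)
print(cds)                      # ATGAAGTCTTAA
print(has_start_and_stop(cds))  # True
```

A check of this kind can also flag transcription damage: if the extracted range does not begin with ATG or end with a stop codon, either the coordinates or the sequence text may have been altered.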



Seq ID NO: 42 Protein sequence: 
Protein Accession #: NP_004440 



1 11 21 31 41 51 

I I I I I I 

MIQTVPDPAA HIKEALSVVS EDQSLFECAY GTPHLAKTEM TASSSSDYGQ TSKMSPRVPQ 60 
QDWLSQPPAR VTIKMECNPS QVNGSRNSPD ECSVAKGGKM VGSPDTVGMN YGSYMEEKHM 120 
PPPNMTTNER RVIVPADPTL WSTDHVRQWL EWAVKEYGLP DVNILLFQNI DGKELCKMTK 180 
DDFQRLTPSY NADILLSHLH YLRETPLPHL TSDDVDKALQ NSPRLMHARN TDLPYEPPRR 240 
SAWTGHGHPT PQSKAAQPSP STVPKTEDQR PQLDPYQILG PTSSRLANPG SGQIQLWQFL 300 
LELLSDSSNS SCITWEGTNG EFKMTDPDEV ARRWGERKSK PNMNYDKLSR ALRYYYDKNI 360 
MTKVHGKRYA YKFDFHGIAQ ALQPHPPESS LYKYPSDLPY MGSYHAHPQK MNFVAPHPPA 420 
LPVTSSSFFA APNPYWNSPT GGIYPNTRLP TSHMPSHLGT YY 



Seq ID NO: 43 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005100 

Coding sequence: 192..5537 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CCTTCTTTTA AGGAGTTTGC CGCGAGCGCG TCTCCTTCAT TCGCAGGCTG GGCGCGTTCG 60 

CAGTOGGCTG GCGGCGAAGG AAGGCGCTCT CGGGACCTCA CGGGCGCGCG TCTTTTGGCT 120 

CTTGCCCCTG TCCCTGCGGC TTGGGGAAAG CGTAACCOGG CGGCTAGGCG CGGGAGAAGT 180 

GCGGAGGAGC CATGGGCGCC GGGAGCTCCA COGAGCAGCG CAGCCCGGAG CAGCCGCCCG 240 

AGGGGAGCTC CACGCCGGCT GAGCCCGAGC CCAGCGGCGG CGGCCCCTCG GCCGAGGCGG 300 

CGCCAGACAC CACCGCGGAC CCCGCCATCG CTGCCTCGGA CCCCGCCACC AAGCTCCTAC 360 

AGAAGAATGG TCAGCTGTCC ACCATCAATG GCGTAGCTGA GCAAGATGAG CTCAGCCTCC 420 

AGGAGGGTGA CCTAAATGGC CAGAAAGGAG CCCTGAACGG TCAAGGAGCC CTAAACAGCC 480 

AGGAGGAAGA AGAAGTCATT GTCACGGAGG TTGGACAGAG AGACTCTGAA GATGTGAGCG 540 

AAAGAGACTC CGATAAAGAG ATGGCTACTA AGTCAGCGGT TGTTCACGAC ATCACAGATG 600 

ATGGGCAGGA GGAGAACCGA AATATCGAAC AGATTCCTTC TTCAGAAAGC AATTTAGAAG 660 

AGCTAACACA ACCCACTGAG TCCCAGGCTA ATGATATTGG ATTTAAGAAG GTGTTTAAGT 720 

TTGTTGGCTT TAAATTCACT GTGAAAAAGG ATAAGACAGA GAAGCCTGAC ACTGTCCAGC 780 
TACTCACTGT GAAGAAAGAT GAAGGGGAGG GAGCAGCAGG GGCTGGCGAC CACCAGGACC 840 
CCAGCCTTGG GGCTGGAGAA GCAGCATCCA AAGAAAGCGA ACCCAAACAA TCTACAGAGA 900 
AACCCGAAGA GACCCTGAAG CGTGAGCAAA GCCACGCAGA AATTTCTCCC CCAGCCGAAT 960 
CTGGCCAAGC AGTGGAGGAA TGCAAAGAGG AAGGAGAAGA GAAACAAGAA AAAGAACCTA 1020 
GCAAGTCTGC AGAATCTCCG ACTAGTCCCG TGACCAGTGA AACAGGATCA ACCTTCAAAA 1080 
AATTCTTCAC TCAAGGTTGG GCCGGCTGGC GCAAAAAGAC CAGTTTCAGG AAGCCGAAGG 1140 
AGGATGAAGT GGAAGCTTCA GAGAAGAAAA AGGAACAAGA GCCAGAAAAA GTAGACACAG 1200 
AAGAAGACGG AAAGGCAGAG GTTGCCTCCG AGAAACTGAC CGCCTCCGAG CAAGCCCACC 1260 
CACAGGAGCC GGCAGAAAGT GCCCACGAGC CCCGGTTATC AGCTGAATAT GAGAAAGTTG 1320 
AGCTGCCCTC AGAGGAGCAA GTCAGTGGCT CGCAGGGACC TTCTGAAGAG AAACCTGCTC 1380 
CGTTGGCGAC AGAAGTGTTT GATGAGAAAA TAGAAGTCCA CCAAGAAGAG GTTGTGGCCG 1440 
AAGTCCACGT CAGCACCGTG GAGGAGAGAA CCGAAGAGCA GAAAACGGAG GTGGAAGAAA 1500 
CAGCAGGGTC TGTGCCAGCT GAAGAATTGG TTGGAATGGA TGCAGAACCT CAGGAAGCCG 1560 
AACCTGCCAA GGAGCTGGTG AAGCTCAAAG AAACGTGTGT TTCCGGAGAG GACCCTACAC 1620 
AGGGAGCTGA CCTCAGTCCT GATGAGAAGG TGCTGTCCAA ACCCCCCGAA GGCGTTGTGA 1680 
GTGAGGTGGA AATGCTGTCA TCACAGGAGA GAATGAAGGT GCAGGGAAGT CCACTAAAGA 1740 
AGCTTTTTAC CAGCACTGGC TTAAAAAAGC TTTCTGGAAA GAAACAGAAA GGGAAAAGAG 1800 
GAGGAGGAGA CGAGGAATCA GGGGAGCACA CTCAGGTTCC AGCCGATTCT CCGGACAGCC 1860 
AGGAGGAGCA AAAGGGCGAG AGCTCTGCCT CATCCCCTGA GGAGCCCGAG GAGATCACGT 1920 
GTCTGGAAAA GGGCTTAGCC GAGGTGCAGC AGGATGGGGA AGCTGAAGAA GGAGCTACTT 1980 
CCGATGGAGA GAAAAAAAGA GAAGGTGTCA CTCCCTGGGC ATCATTCAAA AAGATGGTGA 2040 
CGCCCAAGAA GCGTGTTAGA CGGCCTTCGG AAAGTGATAA AGAAGATGAG CTGGACAAGG 2100 
TCAAGAGCGC TACCTTGTCT TCCACCGAGA GCACAGCCTC TGAAATGCAA GAAGAAATGA 2160 
AAGGGAGCGT GGAAGAGCCA AAGCCGGAAG AACCAAAGCG CAAGGTGGAT ACCTCAGTAT 2220 
CTTGGGAAGC TTTAATTTGT GTGGGATCAT CCAAGAAAAG AGCAAGGAGA AGGTCCTCTT 2280 
CTGATGAGGA AGGGGGACCA AAAGCAATGG GAGGAGACCA CCAGAAAGCT GATGAGGCCG 2340 
GAAAAGACAA AGAGACGGGG ACAGACGGGA TCCTTGCTGG TTCCCAAGAA CATGATCCAG 2400 
GGCAGGGAAG TTCCTCCCCG GAGCAAGCTG GAAGCCCTAC CGAAGGGGAG GGCGTTTCCA 2460 
CCTGGGAGTC ATTTAAAAGG TTAGTCACGC CAAGAAAAAA ATCAAAGTCC AAGCTGGAAG 2520 
AGAAAAGCGA AGACTCCATA GCTGGGTCTG GTGTAGAACA TTCCACTCCA GACACTGAAC 2580 
CCGGTAAAGA AGAATCCTGG GTCTCAATCA AGAAGTTTAT TCCTGGACGA AGGAAGAAAA 2640 
GGCCAGATGG GAAACAAGAA CAAGCCCCTG TTGAAGACGC AGGGCCAACA GGGGCCAACG 2700 
AAGATGACTC TGATGTCCCG GCCGTGGTCC CTCTGTCTGA GTATGATGCT GTAGAAAGGG 2760 
AGAAAATGGA GGCACAGCAA GCCCAAAAAG GCGCAGAGCA GCCCGAGCAG AAGGCAGCCA 2820 
CTGAGGTGTC CAAGGAGCTC AGCGAGAGTC AGGTTCATAT GATGGCAGCA GCTGTCGCTG 2880 
ACGGGACGAG GGCAGCTACC ATTATTGAAG AAAGGTCTCC TTCTTGGATA TCTGCTTCAG 2940 
TGACAGAACC TCTTGAACAA GTAGAAGCTG AAGCCGCACT GTTAACTGAG GAGGTATTGG 3000 
AAAGAGAAGT AATTGCAGAA GAAGAACCCC CCACGGTTAC TGAACCTCTG CCAGAGAACA 3060 
GAGAGGCCCG GGGCGACACG GTCGTTAGTG AGGCGGAATT GACCCCCGAA GCTGTGACAG 3120 
CTGCAGAAAC TGCAGGGCCA TTGGGTTCCG AAGAAGGAAC CGAAGCATCT GCTGCTGAAG 3180 
AGACCACAGA AATGGTGTCA GCAGTCTCCC AGTTAACCGA CTCCCCAGAC ACCACAGAGG 3240 
AGGCCACTCC GGTGCAGGAG GTGGAAGGTG GCGTACCTGA CATAGAAGAG CAAGAGAGGC 3300 
GGACTCAAGA GGTCCTCCAG GCAGTGGCAG AAAAAGTGAA AGAGGAATCC CAGCTGCCTG 3360 
GCACCGGTGG GCCAGAAGAT GTGCTTCAGC CTGTGCAGAG AGCAGAGGCA GAAAGACCAG 3420 
AAGAGCAGGC TGAAGCGTCG GGTCTGAAGA AAGAGACGGA TGTAGTGTTG AAAGTAGATG 3480 
CTCAGGAGGC AAAAACTGAG CCTTTTACAC AAGGGAAGGT GGTGGGGCAG ACCACCCCAG 3540 
AAAGCTTTGA AAAAGCTCCT CAAGTCACAG AGAGCATAGA GTCCAGTGAG CTTGTAACCA 3600 
CTTGTCAAGC CGAAACCTTA GCTGGGGTAA AATCACAGGA GATGGTGATG GAACAGGCTA 3660 
TCCCCCCTGA CTCGGTGGAA ACCCCTACAG ACAGTGAGAC TGATGGAAGC ACCCCCGTAG 3720 
CCGACTTTGA CGCACCAGGC ACAACCCAGA AAGACGAGAT TGTGGAAATC CATGAGGAGA 3780 
ATGAGGTCGC ATCTGGTACC CAGTCAGGGG GCACAGAAGC AGAGGCAGTT CCTGCACAGA 3840 
AAGAGAGGCC TCCAGCACCT TCCAGTTTTG TGTTCCAGGA AGAAACTAAA GAACAATCAA 3900 
AGATGGAAGA CACTCTAGAG CATACAGATA AAGAGGTGTC AGTGGAAACT GTATCCATTC 3960 
TGTCAAAGAC TGAGGGGACT CAAGAGGCTG ACCAGTATGC TGATGAGAAA ACCAAAGACG 4020 
TACCATTTTT CGAAGGACTT GAGGGGTCTA TAGACACAGG CATAACAGTC AGTCGGGAAA 4080 
AGGTCACTGA AGTTGCCCTT AAAGGTGAAG GGACAGAAGA AGCTGAATGT AAAAAGGATG 4140 
ATGCTCTTGA ACTGCAGAGT CACGCTAAGT CTCCTCCATC CCCCGTGGAG AGAGAGATGG 4200 
TAGTTCAAGT CGAAAGGGAG AAAACAGAAG CAGAGCCAAC CCATGTGAAT GAAGAGAAGC 4260 
TTGAGCACGA AACAGCTGTT ACCGTATCTG AAGAGGTCAG TAAGCAGCTC CTCCAGACAG 4320 
TGAATGTGCC CATCATAGAT GGGGCAAAGG AAGTCAGCAG TTTGGAAGGA AGCCCTCCTC 4380 
CCTGCCTAGG TCAAGAGGAG GCAGTATGCA CCAAAATTCA AGTTCAGAGC TCTGAGGCAT 4440 
CATTCACTCT AACAGCGGCT GCAGAGGAGG AAAAGGTCTT AGGAGAAACT GCCAACATTT 4500 
TAGAAACAGG TGAAACGTTG GAGCCTGCAG GTGCACATTT AGTTCTGGAA GAGAAATCCT 4560 
CTGAAAAAAA TGAAGACTTT GCCGCTCATC CAGGGGAAGA TGCTGTGCCC ACAGGGCCCG 4620 
ACTGTCAGGC AAAATCGACA CCAGTGATAG TATCTGCTAC TACCAAGAAA GGCTTAAGTT 4680 
CCGACCTGGA AGGAGAGAAA ACCACATCAC TGAAGTGGAA GTCAGATGAA GTCGATGAGC 4740 
AGGTTGCTTG CCAGGAGGTC AAAGTGAGTG TAGCAATTGA GGATTTAGAG CCTGAAAATG 4800 
GGATTTTGGA ACTTGAGACC AAAAGCAGTA AACTTGTCCA AAACATCATC CAGACAGCCG 4860 
TTGACCAGTT TGTACGTACA GAAGAAACAG CCACCGAAAT GTTGACGTCT GAGTTACAGA 4920 
CACAAGCTCA CGTGATAAAA GCTGACAGCC AGGACGCTGG ACAGGAAACG GAGAAAGAAG 4980 
GAGAGGAACC TCAGGCCTCT GCACAGGATG AAACACCAAT TACTTCAGCC AAAGAGGAGT 5040 
CAGAGTCAAC CGCAGTGGGA CAAGCACATT CTGATATTTC CAAAGACATG AGTGAAGCCT 5100 
CAGAAAAGAC CATGACTGTT GAGGTAGAAG GTTCCACTGT AAATGATCAG CAGCTGGAAG 5160 
AGGTCGTCCT CCCATCTGAG GAAGAGGGAG GTGGAGCTGG AACAAAGTCT GTGCCAGAAG 5220 
ATGATGGTCA TGCCTTGTTA GCAGAAAGAA TAGAGAAGTC ACTAGTTGAA CCGAAAGAAG 5280 
ATGAAAAAGG TGATGATGTT GATGACCCTG AAAACCAGAA CTCAGCCCTG GCTGATACTG 5340 
ATGCCTCAGG AGGCTTAACC AAAGAGTCCC CAGATACAAA TGGACCAAAA CAAAAAGAGA 5400 

AGGAGGATGC CCAGGAAGTA GAATTGCAGG AAGGAAAAGT GCACAGTGAA TCAGATAAAG 5460 

CGATCACACC CCAAGCACAG GAGGAGTTAC AGAAACAAGA GAGAGAATCT GCAAAGTCAG 5520 

AACTTACAGA ATCTTAAAAC ATCATGCAGT TAAACTCATT GTCTGTTTGG AAGACCAGAA 5580 

TGTGAAGACA AGTAGTAGAA GAAAATGAAT GCTGCTGCTG AGACTGAAGA CCAGTATTTC 5640 

AGAACTTTGA GAATTGGAGA GCAGGCACAT CAACTGATCT CATTTCTAGA GAGCCCCTGA 5700 

CAATCCTGAG GCTTCATCAG GAGCTAGAGC CATTTAACAT TTCCTCTTTC CAAGACCAAC 5760 

CTACAATTTT CCCTTGATAA CCATATAAAT TCTGATTTAA GGTCCTAAAT TCTTAACCTG 5820 

GAACTGGAGT TGGCAATACC TAGTTCTGCT TCTGAAACTG GAGTATCATT CTTTACATAT 5880 

TTATATGTAT GTTTTAAGTA GTCCTCCTGT ATCTATTGTA TATTTTTTTC TTAATGTTTA 5940 

AGGAAATGTG CAGGATACTA CATGCTTTTT GTATCACACA GTATATGATG GGGCATGTGC 6000 

CATAGTGCAG GCTTGGGGAG CTTTAAGCCT CAGTTATATA ACCCACAAAA AACAGAGCCT 6060 

CCTAGATGTA ACATTCCTGA TCAAGGTACA ATTCTTTAAA ATTCACTAAT GATTGAGGTC 6120 

CATATTTAGT GGTACTCTGA AATTGGTCAC TTTCCTATTA CACGGAGTGT GCCAAAACTA 6180 

AAAAGCATTT TGAAACATAC AGAATGTTCT ATTGTCATTG GGAAATTTTG CTTTCTAACC 6240 

CAGTGGAGGT TAGAAAGAAG TTATATTCTG GTAGCAAATT AACTTTACAT CCTTTTTCCT 6300 

ACTTGTTATG GTTGTTTGGA CCGATAAGTG TGCTTAATCC TGAGGCAAAG TAGTGAATAT 6360 

GTTTTATATG TTATGAAGAA AAGAATTGTT GTAAGTTTTT GATTCTACTC TTATATGCTG 6420 

GACTGCATTC ACACATGGCA TGAAATAAGT CAGGTTCTTT ACAAATGGTA TTTTGATAGA 6480 

TACTGGATTG TGTTTGTGCC ATATTTGTGC CATTCCTTTA AGAACAATGT TGCAACACAT 6540 

TCATTTGGAT AAGTTGTGAT TTGACGACTG ATTTAAATAA AATATTTGCT TCACTTAAAA 6600 
AAAAAAAA 

Seq ID NO: 44 Protein sequence: 
Protein Accession #: NP_005091

1 11 21 31 41 51

MGAGSSTEQR SPEQPPEGSS TPAEPEPSGG GPSAEAAPDT TADPAXAASD PATKLLQKNG 60
QLSTINGVAE QDELSLQEGD LNGQKGALNG QGALNSCEEE EVIVTEVGQR DSEDVSERDS 120
DKEMATKSAV VHDITDDGQE ENRNIEQIPS SESNLEELTQ PTESQANDIG FKXVFKFVGF 180
KFTVKKDKTE KPDTVQLLTV KKDEGEGAAG AGDHQDPSLG AGEAASKESE PKQSTEKPEE 240
TLKREQSHAE ISPPAESGQA VEECKEEGEE KQEKEPSKSA ESPTSPVTSE TGSTFKKFPT 300
QGWAGWRKKT SFRKPKEDEV EASEKKKEQE PEKVDTEEDG KAEVASEKLT ASEQAHPQEP 360
AESAHEPRLS AEYEKVELPS EEQVSGSQGP SEEKPAPLAT EVFDEKIEVH QEEVVAEVHV 420
STVEERTEEQ KTEVEETAGS VPAEELVGMD AEPQEAEPAK ELVKLKETCV SGEDPTQGAD 480
LSPDEKVLSK PPEGVVSEVE MLSSQERMKV QGSPLKKLFT STGLKKLSGK KQKGKRGGGD 540
EESGEHTQVP ADSPDSQEEQ KGESSASSPE EPEElTCLEK GLAEVQQDGE AEEGATSDGE 600
KKREGVTPWA SPKKMVTPKK RVRRPSESDK EDELDKVKSA TLSSTESTAS EMQEEMKGSV 660
EEPKPEEPKR KVDTSVSWEA LICVGSSKKR ARRRSSSDEE GGPKAMGGDH QKADEAGKDK 720
ETGTDGILAG SQEHDPGQGS SSPEQAGSPT EGEGVSTWES FKRLVTPRKK SKSKLEEKSE 780
DSIAGSGVEH STPDTEPGKE ESWVSIKKFI PGRRKKRPDG KQEQAPVEDA GPTGANEDDS 840
DVPAVVPLSE YDAVEREKME AQQAQKGAEQ PEQKAATEVS KELSESQVHM MAAAVADGTR 900
AATIIEERSP SWISASVTEP LEQVEAEAAL LTEEVLEREV IAEEEPPTVT EPLPENREAR 960
GDTVVSEAEL TPEAVTAAET AGPLGSEEGT EASAAEETTE MVSAVSQLTD SPDTTEEATP 1020
VQEVEGGVPD IEEQERRTQE VLQAVAEKVK EESQLPGTGG PEDVLQPVQR AEAERPEEQA 1080
EASGLKKETD VVLKVDAQEA KTEPPTQGKV VGQTTPESFE KAPQVTESIE SSELVTTCQA 1140
ETLAGVKSQE MVMEQAIPPD SVETPTDSET DGSTPVADFD APGTTQKDEI VEIHEENEVA 1200
SGTQSGGTEA EAVPAQKERP PAPSSFVFQE ETKEQSKMED TLEHTDKEVS VETVSILSKT 1260
EGTQEADQYA DEKTKDVPFF EGLEGSIDTG ITVSREKVTE VALKGEGTEE AECKKDDALE 1320
LQSHAKSPPS PVEREMVVQV EREKTEAEPT HVNEEKLEHE TAVTVSEEVS KQLLQTVNVP 1380
IIDGAKEVSS LEGSPPPCLG QEEAVCTKIQ VQSSEASFTL TAAAEEEKVL GETANILETG 1440
ETLEPAGAHL VLEEKSSEKN EDFAAHPGED AVPTGPDCQA KSTPVTVSAT TKKGLSSDLE 1500
GEKTTSLKWK SDEVDEQVAC QEVKVSVAIE DLEPENGILE LETKSSKLVQ NIIQTAVDQF 1560
VRTEETATEM LTSELQTQAH VIKADSQDAG QETEKEGEEP QASAQDETPI TSAKEESEST 1620
AVGQAHSDIS KDMSEASEKT MTVEVEGSTV NDQQLEEVVL PSEEEGGGAG TKSVPEDDGH 1680
ALIAERIEKS LVEPKEDEKG DDVDDPENQN SALADTDASG GLTKESPDTN GPKQKEKEDA 1740
QEVELQEGKV HSESDKAITP QAQEELQKQE RESAKSELTE S

Seq ID NO: 45 Nucleotide sequence: 
Nucleic Acid Accession #: NM_001290 

Coding sequence: 110..1231 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GTGAGCGTGT GTGCGTGCGT CTACTTTGTA CTGGGAAGAA CACAGCCCAT GTGCTCTGCA 60
TGGACGTTAC TGATACTCTG TTTAGCTTGA TTTTCGAAAA GCAGGCAAGA TGTCCTCCAC 120
ACCACATGAC CCCTTCTATT CTTCTCCTTT CGGCCCATTT TATAGGAGGC ATACACCATA 180
CATGGTACAG CCAGAGTACC GAATCTATGA GATGAACAAG AGACTGCAGT CTCGCACAGA 240
GGATAGTGAC AACCTCTGGT GGGACGCCTT TGCCACTGAA TTTTTTGAAG ATGACGCCAC 300
ATTAACCCTT TCATTTTGTT TGGAAGATGG ACCAAAGCGA TACACTATCG GCAGGACCCT 360
CATCCCCCGT TACTTTAGCA CTGTGTTTGA AGGAGGGGTG ACCGACCTGT ATTACATTCT 420
CAAACACTCG AAAGAGTCAT ACCACAACTC ATCCATCACG GTGGACTGCG ACCAGTGTAC 480
CATGGTCACC CAGCACGGGA AGCCCATGTT TACCAAGGTA TGTACAGAAG GCAGACTGAT 540
CTTGGAGTTC ACCTTTGATG ATCTCATGAG AATCAAAACA TGGCACTTTA CCATTAGACA 600

ATACCGAGAG TTAGTCCCGA GAAGCATCCT AGCCATGCAT GCACAAGATC CTCAGGTCCT 660 

GGATCAGCTG TCCAAAAACA TCACCAGGAT GGGGCTAACA AACTTCACCC TCAACTACCT 720 

CAGGTTGTGT GTAATATTGG AGCCAATGCA GGAACTGATG TCGAGACATA AAACTTACAA 780 

CCTCAGTCCC CGAGACTGCC TGAAGACCTG CTTGTTTCAG AAGTGGCAGA GGATGGTGGC 840 

TCCGCCAGCA GAACCCACAA GGCAACCAAC AACCAAACGG AGAAAAAGGA AAAATTCCAC 900 

CAGCAGCACT TCCAACAGCA GCGCTGGGAA CAATGCAAAC AGCACTGGCA GCAAGAAGAA 960 

GACCACAGCT GCAAACCTGA GTCTGTCCAG TCAGGTACCT GATGTGATGG TGGTAGGAGA 1020 

GCCAACTCTG ATGGGAGGTG AGTTTGGGGA CGAGGACGAA AGGCTAATCA CTAGATTAGA 1080 

AAACACGCAA TATGATGCGG CCAACGGCAT GGACGACGAG GAGGACTTCA ACAATTCACC 1140 

CGCGCTGGGG AACAACAGCC CGTGGAACAG TAAACCTCCC GCCACTCAAG AGACCAAATC 1200 

AGAAAACCCC CCACCCCAGG CTTCCCAATA AGATGATCGG CACCAGAATC CACTGTCAAT 1260 

AGGCCCGTGG GTGATCATTA CAATTGCAAA TCTTTACTTA CAGGAGAGGA AACAGAAGAG 1320 

ATAAAAACTT TTCCATGCAA ATATCTATTT CTAAACCACA ATGATCTGAT TTTCTTTCTT 1380 

CTTTCTTTTT TTCTAATTGA GAGGATTATT CCCAGTAAGC TTCCATGACC CTTTCTTGGA 1440 

GGCCTTCACA GGTAATACAG ATACTGGCAC TGATTGTAAT TAAAATGAGA GAAAACTCTA 1500 

GCGCATCTTC TGGCACGGTT TTAACAACGT GTTTGTGTTG AATTTCCTTT TTATGCATCA 1560 

AACGAAGGCC ATATTGTCCA TAAATGCTCA GTGCTCAGGA TCTCATTAAT ATGCCGAACC 1620 

TAACTACAGA TGACTTTTTA ATATTGTAAA ATATTTTCTG CTTTTTGACT TGCATCTGAG 1680 

AGTTTCTTGT TTCAGTAAAA AAAGAAAAGA CAAAAAAATC AGCTTTGGAA AGTAATTTAA 1740 

ATGTACCTTA TTTTTTTTTT CTTTATGTTT TCTTTCATTG GGCAACAGCT AAGAGGGCCC 1800 

AGCAAGGTAA TTTATGGTTG AGCTGATGTC AATTGGTTCT TGTCTTGAGT CGACTCAATT 1860 

TAGCCCAAGT GCTGAAACAA GAAATGTCAT TTTTTTCATC AAAGACACCA GGGCAGATTT 1920 

TTAAGTAAAG AAAGACAATT GGACCCTTAA GAATTTATGC ATTTGTAAAG TTGCTGTTGA 1980 

TCCAAATATT TTCAAGCCAT GTAATCCATT GGTTTTGTGG GCAGTTTAAT AAACCTGAAC 2040 

CTTTGTGTGT TTTCTAATTG TACCTGAGTT GACCATCCTT TCTTTTTATA GTATATTTCT 2100 

TGTATGATAT TTTGTAAAGC TCTCACCTGG TTCTTTTATG GGGACTTTTC GTTTTTGGGC 2160 

AACTCCAGTG TATTTATGTG AAACTTTATA AGAGAATTAA TTTTTCCATT TGCATATTAA 2220 

TATGTTCCTC CACACATGTA AAGGCACAGT GGCTCCGTGT GTTAAAAAAC AGCTGTATTT 2280 
TATGTATGCT TTACTGATAA GTGTGCCAAT AATAAACTGT GTTAATGACC 



Seq ID NO: 46 Protein sequence:
Protein Accession #: NP_001281

1 11 21 31 41 51

I I I I I I

MSSTPHDPFY SSPFGPFYRR HTPYMVQPEY RIYEMNKRLQ SRTEDSDNLW WDAFATEFFE 60
DDATLTLSFC LEDGPKRYTI GRTLIPRYFS TVFEGGVTDL YYILKHSKES YHNSSITVDC 120
DQCTMVTQHG KPMFTKVCTE GRLILEFTFD DLMRIKTWHF TIRQYRELVP RSILAMHAQD 180
PQVLDQLSKN ITRMGLTNFT LNYLRLCVIL EPMQELMSRH KTYNLSPRDC LKTCLFQKWQ 240
RMVAPPAEPT RQPTTKRRKR KNSTSSTSNS SAGNNANSTG SKKKTTAANL SLSSQVPDVM 300
VVGEPTLMGG EFGDEDERLI TRLENTQYDA ANGMDDEEDF NNSPALGNNS PWNSKPPATQ 360
ETKSENPPPQ ASQ

Seq ID NO: 47 Nucleotide sequence:
Nucleic Acid Accession #: NM_004126

Coding sequence: 108..329 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 60 

AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC 120 

ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 180 

AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 240 

AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 300 

AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360 

AGAACTAGTT TGTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420 

TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT 480 

GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 540

ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT 600 
GCTTCAAATA AAGTTTTGTC TT 

Seq ID NO: 48 Protein sequence: 
Protein Accession #: NP_004117 

1 11 21 31 41 51 

I I I I I I

MPALHIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED 60
KNPFKEKGSC VIS 



Seq ID NO: 49 Nucleotide sequence: 
Nucleic Acid Accession #: XM_051896 

Coding sequence: 139..2388 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GTTTTAAAGA CGCTAGAGTG CCAAAGAAGA CTTTGAAGTG TGAAAACATT TCCTGTAATT 60 

GAAACCAAAA TGTCATTTAT AGATCCTTAC CAGCACATTA TAGTGGAGCA CCAGTATTCC 120

CACAAGTTTA CGGTAGTGGT GTTACGTGCC ACCAAAGTGA CAAAGGGGGC CTTTGGTGAC 180 

ATGCTTGATA CTCCAGATCC CTATGTGGAA CTTTTTATCT CTACAACCCC TGACAGCAGG 240 

AAGAGAACAA GACATTTCAA TAATGACATA AACCCTGTGT GGAATGAGAC CTTTGAATTT 300 

ATTTTGGATC CTAATCAGGA AAATGTTTTG GAGATTACGT TAATGGATGC CAATTATGTC 360

ATGGATGAAA CTCTAGGGAC AGCAACATTT ACTGTATCTT CTATGAAGGT GGGAGAAAAG 420 

AAAGAAGTTC CTTTTATTTT CAACCAAGTC ACTGAAATGG TTCTAGAAAT GTCTCTTGAA 480 

GTTTGCTCAT GCCCAGACCT ACGATTTAGT ATGGCTCTGT GTGATCAGGA GAAGACTTTC 540 

AGACAACAGA GAAAAGAACA CATAAGGGAG AGCATGAAGA AACTCTTGGG TCCAAAGAAT 600 

AGTGAAGGAT TGCATTCTGC ACGTGATGTG CCTGTGGTAG CCATATTGGG TTCAGGTGGG 660 

GGTTTCCGAG CCATGGTGGG ATTCTCTGGT GTGATGAAGG CATTATACGA ATCAGGAATT 720 

CTGGATTGTG CTACCTACGT TGCTGGTCTT TCTGGCTCCA CCTGGTATAT GTCAACCTTG 780

TATTCTCACC CTGATTTTCC AGAGAAAGGG CCAGAGGAGA TTAATGAAGA ACTAATGAAA 840 

AATGTTAGCC ACAATCCCCT TTTACTTCTC ACACCACAGA AAGTTAAAAG ATATGTTGAG 900 

TCTTTATGGA AGAAGAAAAG CTCTGGACAA CCTGTCACCT TTACTGATAT CTTTGGGATG 960 

TTAATAGGAG AAACACTAAT TCATAATAGA ATGAATACTA CTCTGAGCAG TTTGAAGGAA 1020 

AAAGTTAATA CTGCACAATG CCCTTTACCT CTTTTCACCT GTCTTCATGT CAAACCTGAC 1080 

GTTTCAGAGC TGATGTTTGC AGASTGGGTT GAATTTAGTC CATACGAAAT TGGCATGGCT 1140 

AAATATGGTA CTTTTATGGC TCCCGACTTA TTTGGAAGCA AATTTTTTAT GGGAACAGTC 1200 

GTTAAGAAGT ATGAAGAAAA CCCCTTGCAT TTCTTAATGG GTGTCTGGGG CAGTGCCTTT 1260 

TCCATATTGT TCAACAGAGT TTTGGGCGTT TCTGGTTCAC AAAGCAGAGG CTCCACAATG 1320 

GAGGAAGAAT TAGAAAATAT TACCACAAAG CATATTGTGA GTAATGATAG CTCGGACAGT 1380 

GATGATGAAT CACACGAACC CAAAGGCACT GAAAATGAAG ATGCTGGAAG TGACTATCAA 1440 

AGTGATAATC AAGCAAGTTG GATTCATCGT ATGATAATGG CCTTGGTGAG TGATTCAGCT 1500

TTATTCAATA CCAGAGAAGG ACGTGCTGGG AAGGTACACA ACTTCATGCT GGGCTTGAAT 1560 

CTCAATACAT CTTATCCACT GTCTCCTTTG AGTGACTTTG CCACACAGGA CTCCTTTGAT 1620 

GATGATGAAC TGGATGCAGC TGTAGCAGAT CCTGATGAAT TTGAGCGAAT ATATGAGCCT 1680 

CTGGATGTCA AAAGTAAAAA GATTCATGTA GTGGACAGTG GGCTCACATT TAACCTGCCG 1740 

TATCCCTTGA TACTGAGACC TCAGAGAGGG GTTGATCTCA TAATCTCCTT TGACTTTTCT 1800 

GCAAGGCCAA GTGACTCTAG TCCTCCGTTC AAGGAACTTC TACTTGCAGA AAAGTGGGCT 1860 

AAAATGAACA AGCTCCCCTT TCCAAAGATT GATCCTTATG TGTTTGATCG GGAAGGGCTG 1920 

AAGGAGTGCT ATGTCTTTAA ACCCAAGAAT CCTGATATGG AGAAAGATTG CCCAACCATC 1980 

ATCCACTTTG TTCTGGCCAA CATCAACTTC AGAAAGTACA GGGCTCCAGG TGTTCCAAGG 2040 

GAAACTGAGG AAGAGAAAGA AATCGCTGAC TTTGATATTT TTGATGACCC AGAATCACCA 2100 

TTTTCAACCT TCAATTTTCA ATATCCAAAT CAAGCATTCA AAAGACTACA TGATCTTATG 2160 

CACTTCAATA CTCTGAACAA CATTGATGTG ATAAAAGAAG CCATGGTTGA AAGCATTGAA 2220 

TATAGAAGAC AGAATCCATC TCGTTGCTCT GTTTCCCTTA GTAATGTTGA GGCAAGAAGA 2280 

TTTTTCAACA AGGAGTTTCT AAGTAAACCC AAAGCATAGT TCATGTACTG GAAATGGCAG 2340 

CAGTTTCTGA TGCTGAGGCA GTTTGCAATC CCATGACAAC TGGATTTAAA AGTACAGTAC 2400 

AGATAGTCGT ACTGATCATG AGAGACTGGC TGATACTCAA AGTTGCAGTT ACTTAGCTGC 2460 

ATGAGAATAA TACTATTATA AGTTAGGTTG ACAAATGATG TTGATTATGT AAGGATATAC 2520 

TTAGCTACAT TTTCAGTCAG TATGAACTJC CTGATACAAA TGTAGGGATA TATACTGTAT 2580 

TTTTAAACAT TTCTCACCAA CTTTCTTATG TGTGTTCTTT TTAAAAATTT TTTTTCTTTT 2640

AAAATATTTA ACAGTTCAAT CTCAATAAGA CCTCGCATTA TGTATGAATG TTATTCACTG 2700 

ACTAGATTTA TTCATACCAT GAGACAACAC TATTTTTATT TATATATGCA TATATATACA 2760 
TACATGAAAT AAATACATCA ATATAAAAAT 

Seq ID NO: 50 Protein sequence: 
Protein Accession #: XP_051896 

1 11 21 31 41 51 

I I I I I I 

MSFIDPYQHI IVEHQYSHKF TVVVLRATKV TKGAFGDMLD TPDPYVELFI STTPDSRKRT 60
RHFNNDINPV WNETFEFILD PNQENVLEIT LMDANYVMDE TLGTATFTVS SMKVGEKKEV 120
PFIFNQVTEM VLEMSLEVCS CPDLRFSMAL CDQEKTFRQQ RKEHIRESMK KLLGPKNSEG 180
LHSARDVPVV AILGSGGGFR AMVGFSGVMK ALYESGILDC ATYVAGLSGS TWYMSTLYSH 240
PDFPEKGPEE INEELMKNVS HNPLLLLTPQ KVKRYVESLW KKKSSGQPVT FTDIFGMLIG 300
ETLIHNRMNT TLSSLKEKVN TAQCPLPLFT CLHVKPDVSE LMFADWVEFS PYEIGMAKYG 360
TFMAPDLFGS KFFMGTVVKK YEENPLHFLM GVWGSAFSIL FNRVLGVSGS QSRGSTMEEE 420
LENITTKHIV SNDSSDSDDE SHEPKGTENE DAGSDYQSDN QASWIHRMIM ALVSDSALFN 480
TREGRAGKVH NFMLGLNLNT SYPLSPLSDF ATQDSFDDDE LDAAVADPDE FERIYEPLDV 540
KSKKIHVVDS GLTFNLPYPL ILRPQRGVDL IISFDFSARP SDSSPPFKEL LLAEKWAKMN 600
KLPFPKIDPY VFDREGLKEC YVFKPKNPDM EKDCPTIIHF VLANINFRKY RAPGVPRETE 660
EEKEIADFDI FDDPESPFST FNFQYPNQAF KRLHDLMHFN TLNNIDVIKE AMVESIEYRR 720
QNPSRCSVSL SNVEARRFFN KEFLSKPKA 



Seq ID NO: 51 Nucleotide sequence: 
Nucleic Acid Accession #: NM_006528 

Coding sequence: 57..764 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I 

GCCGCCAGCG GCTTTCTCGG ACGCCTTGCC CAGCGGGCCG CCCGACCCCC TGCACCATGG 60 

ACCCCGCTCG CCCCCTGGGG CTGTCGATTC TGCTGCTTTT CCTGACGGAG GCTGCACTGG 120 

GCGATGCTGC TCAGGAGCCA ACAGGAAATA ACGCGGAGAT CTGTCTCCTG CCCCTAGACT 180 

ACGGACCCTG CCGGGCCCTA CTTCTCCGTT ACTACTACGA CAGGTACACG CAGAGCTGCC 240 

GCCAGTTCCT GTACGGGGGC TGCGAGGGCA ACGCCAACAA TTTCTACACC TGGGAGGCTT 300 

GCGACGATGC TTGCTGGAGG ATAGAAAAAG TTCCCAAAGT TTGCCGGCTG CAAGTGAGTG 360 

TGGACGACCA GTGTGAGGGG TCCACAGAAA AGTATTTCTT TAATCTAAGT TCCATGACAT 420 

GTGAAAAATT CTTTTCCGGT GGGTGTCACC GGAACCGGAT TGAGAACAGG TTTCCAGATG 480 

AAGCTACTTG TATGGGCTTC TGCGCACCAA AGAAAATTCC ATCATTTTGC TACAGTCCAA 540 

AAGATGAGGG ACTGTGCTCT GCCAATGTGA CTCGCTATTA TTTTAATCCA AGATACAGAA 600 

CCTGTGATGC TTTCACCTAT ACTGGCTGTG GAGGGAATGA CAATAACTTT GTTAGCAGGG 660 

AGGATTGCAA ACGTGCATGT GCAAAAGCTT TGAAAAAGAA AAAGAAGATG CCAAAGCTTC 720 

GCTTTGCCAG TAGAATCCGG AAAATTCGGA AGAAGCAATT TTAAACATTC TTAATATGTC 780 

ATCTTGTTTG TCTTTATGGC TTATTTGCCT TTATGGTTGT ATCTGAAGAA TAATATGACA 840 

GCATGAGGAA ACAAATCATT GGTGATTTAT TCACCAGTTT TTATTAATAC AAGTCACTTT 900 

TTCAAAAATT TGGATTTTTT TATATATAAC TAGCTGCTAT TCAAATGTGA GTCTACCATT 960

TTTAATTTAT GGTTCAACTG TTTGTGAGAC GAATTCTTGC AATGCATAAG ATATAAAAGC 1020 

AAATATGACT CACTCATTTC TTGGGGTCGT ATTCCTGATT TCAGAAGAGG ATCATAACTG 1080 

AAACAACATA AGACAATATA ATCATGTGCT TTTAACATAT TTGAGAATAA AAAGGACTAG 1140 
CC 

Seq ID NO: 52 Protein sequence:
Protein Accession #: NP_006519 

1 11 21 31 41 51 

I I I I I I 

MDPARPLGLS ILLLFLTEAA LGDAAQEPTG NNAEICLLPL DYGPCRALLL RYYYDRYTQS 60

CRQFLYGGCE GNANNFYTWE ACDDACWRIE KVPKVCRLQV SVDDQCEGST EKYFFNLSSM 120

TCEKFFSGGC HRNRIENRFP DEATCMGFCA PKKIPSFCYS PKDEGLCSAN VTRYYFNPRY 180 
RTCDAFTYTG CGGNDNNFVS REDCKRACAK ALKKKKKMPK LRFASRIRKI RKKQF 



Seq ID NO: 53 Nucleotide sequence: 
Nucleic Acid Accession #: AA478778 
Coding sequence: no ORF found 



1 11 21 31 41 51

I I I I I I

TATTTTTGTA CGTAAAATGA TTCTATTATG ACTGCCTTTG CATGTAGTAA TATGACAAAG 60
TGATCCTTCA TTATCACGGT ACACTATTGT TTACTTTTCA TCTGTAAATG TTTTATTGTT 120
ACTTTTTTAA AATGAATTTT TTTAAAACAA TCTAGCCATC ATCAAGGTGC TATAAGAGTT 180
GTATAAAAGA TATTTTTGGC ATTTCTAGGC AAGTATCAGC CAATAAGTAT GTTAGTGATA 240
TCACAGATTG TACCAACTAT TAACTATGTT AAATAAGTAT TCAGTTTCAT GTGATCTCTG 300
GGAAAAAAAT ATGCTGCCTT GGTGCTAATA TTGTATGTAT TTAAATGATC ATCTGACTCA 360
GAAATATAAA CACTTTTAAT GAAAGGGAGG AACGGAAGGA CAATTTCCAG TGCACAGAAT 420
CACTTGGATG AAATAAGACC AGCTCTTTAC CCTTATTTTT GGATATGCCT TTTTTGGAAG 480
AGACTTAGAC TTTATCCTTA TTGTTGTTAG TGTTGTTAAT ATTCGTTGCT TCAGCCCACG 540
GTGCCTTGGT CTCTCCACAA TCAAATGGAG GATCCCCCAA GCAGCTTCAT TACAGAGTGA 600
TATTGGGAAA GTGAGATCCT CTCACCATTT TGCCAAGATA CTCTAAAATG ACATCCAAGT 660
TTACCAGTAG AAAGACACAG GATGCACAGA ATGGGCATGA CCTTCAGCTC ACGAGCACAC 720
CTGGAGAAAT TCAGAACCAG GTTCTGAATC ATCACGATTG CCTTTTGCAT GAAAACATCG 780
GCTGGTGATG TGACTTCTCT TCAGGCCATG AGCCTAACAY CCTGCCGGTT TTCATGCCCG 840
CTGCAGTAAT GGACGTTTGT GTGAAGAAAT GAACTGTGGA GTACAAAATG CTTTGAGTCT 900
TTCCGATTGC TCATTAATTC ACTTTTTTGT TACTTCTTTC CAAAATGGAA GTGCTGAAGC 960
CATGGTCTTT CTGCCCCTCC AAGCTGATGA AGGGAAGCCT TTGCCAATGG CCCATGGAAG 1020
ACACTTGGTT TGAGAAACCC TGCCCACTTC CAAAGACCAA AGAGATTAGG AAAAGCCTGG 1080
CAGTA1TCTC CAACTCCAAA CAAGCTCTAG AGTGCTCCAG GAAAAGTTAT ATTCAGTATA 1140
TGAATAAGTG TTATTCTCCA TTATTAATGT GTTCTGAAAA TATATTATGA ATAAATACAT 1200
CACCACACCC AAAAAAAAAA AAAAAAAAAA AAAA



Seq ID NO: 54 Nucleotide sequence: 
Nucleic Acid Accession #: NM_020663 

Coding sequence: 1..645 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGAACTGCA AAGAGGGAAC TGACAGCAGC TGCGGCTGCA GGGGCAACGA CGAGAAGAAG 60 

ATGTTGAAGT GTGTGGTGGT GGGGGACGGT GCCGTGGGGA AAACCTGCCT GCTGATGAGC 120 

TACGCCAACG ACGCCTTCCC AGAGGAATAC GTGCCCACTG TGTTTGACCA CTATGCAGTT 180 

ACTGTGACTG TCGGAGGCAA GCAACACTTG CTCGGACTGT ATCACACCGC GGGACAGGAG 240 

GACTACAACC AGCTGAGGCC ACTCTCCTAC CCCAACACGG ATGTGTTTTT GATCTGCTTC 300 
TCTGTCGTAA ACCCTGCCTC TTACCACAAT GTCCAGGAGG AATGGGTCCC CGAGCTCAAG 360

GACTGCATGC CTCACGTGCC TTATGTCCTC ATAGGGACCC AGATTGATCT CCGTGATGAC 420 

CCAAAAACCT TGGCCCGTTT GCTGTATATG AAAGAGAAAC CTCTCACTTA CGAGCATGGT 480

GTGAAGCTCG CAAAAGCGAT CGGAGCACAG TGCTACTTGG AATGTTCAGC TCTGACTCAG 540 

AAAGGTCTCA AAGCGGTTTT TGATGAAGCA ATCCTCACCA TTTTCCACCC CAAGAAAAAG 600
AAGAAACGCT GTTCTGAGGG TCACAGCTGC TGTTCAATTA TCTGA 



Seq ID NO: 55 Protein sequence: 
Protein Accession #: NP_065714 

1 11 21 31 41 51 

I I I I I I 

MNCKEGTDSS CGCRGNDEKK MLKCVVVGDG AVGKTCLLMS YANDAFPEEY VPTVFDHYAV 60
TVTVGGKQHL LGLYDTAGQE DYNQLRPLSY PNTDVFLICF SVVNPASYHN VQEEWVPELK 120
DCMPHVPYVL IGTQIDLRDD PKTLARLLYM KEKPLTYEHG VKLAKAIGAQ CYLECSALTQ 180
KGLKAVFDEA ILTIFHPKKK KKRCSEGHSC CSII



Seq ID NO: 56 Nucleotide sequence: 

Nucleic Acid Accession #: fgenesh prediction 

Coding sequence: 1-546 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

ATGGCCTTGG GCAGCTCCGC CCCTGTGGCT TTGCAGGGTA ATGCCCACTT CCCTGCTGCT 60
TTCATGGCTG GCATTAAGTG TCTGTGGCTT TTCCAGGTAG TCCCCCTGGG GCTCCCCGAG 120
TTGGTGCAAA GGCTCCTGGG TGGAGCTCGA ACTGAAACTC GCTTTGTGCC CGCAGCCCTG 180
CAGCTCGCCG GTGCCCTCGA CCTGCCCGCT GGGTCCTGTG CCTTTGAAGA GAGCACTTGC 240
GGCTTTGACT CCGTGTTGGC CTCTCTGCCG TGGATTTTAA ATGAGGAAGG CCAGCAACCT 300
TTCTGGTCCT CAGGAGACAT GTCTGACTGG GACTACTGGG TTGGCTGGCG GAAGTTAATT 360
CATTCTCCTC TGAGCACTCC AGGGTGGAGC AGGCAGGTTA GGCTCCAGTT GTTCCAGCTT 420
CAGTTTGTCA AAGGCCAGAA CTTGGACGTA ACAGTCTACT GCAGGCTCCA GGGCAGTGAG 480
AAACCCTTTG AAACTGGTTC CATGGTTCCA TTCACCTTCA TGTACTGGAT CCACCATGGA 540
AAGTAG



Seq ID NO: 57 Protein sequence:

Protein Accession #: fgenesh prediction

1 11 21 31 41 51

I I I I I I

MALGSSAPVA LQGNAHFPAA FMAGIKCLWL FQVVPLGLPE LVQRLLGGAR TETRFVPAAL 60
QLAGALDLPA GSCAFEESTC GFDSVLASLP WILNEEGQQP FWSSGDMSDW DYWVGWRKLI 120
HSPLSTPGWS RQVRLQLFQL QFVKGQNLDV TVYCRLQGSE KPFETGSMVP FTFMYWIHHG 180
K



Seq ID NO: 58 Nucleotide sequence: 
Nucleic Acid Accession #: XM_050478

Coding sequence: 27..4508 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

CCGGCGGCGC CTGAGCCCAG CCGAGGATGG AGAACCGGCC TGGGTCCTTC CAGTACGTCC 60 

CTGTGCAGCT GCAAGGGGGG GCACCCTGGG GCTTCACCCT TAAGGGGGGT CTGGAACACT 120 

GTGAGCCGCT CACAGTGTCT AAGATTGAAG ATGGAGGCAA GGCAGCTTTG TCCCAGAAGA 180 

TGAGGACTGG TGATGAGCTG GTGAATATCA ATGGCACTCC ATTATATGGC TCCCGCCAAG 240 

AGGCCCTCAT TCTCATCAAA GGCTCCTTCC GGATTCTCAA GCTGATTGTC AGGAGGAGGA 300 

ACGCCCCTGT CAGTAGGCCG CACTCATGGC ATGTGGCCAA GCTGCTGGAG GGATGCCCTG 360 

AAGCAGCCAC CACCATGCAT TTCCCTTCTG AAGCCTTCAG CTTGTCCTGG CATTCTGGCT 420 

GCAACACAAG TGACGTGTGT GTGCAGTGGT GTCCACTCTC CCGGCATTGC AGCACCGAGA 480 

AAAGCAGCTC CATTGGCAGC ATGGAGAGCC TGGAGCAACC AGGCCAAGCC ACCTATGAGA 540 

GCCATCTGTT GCCTATTGAC CAGAACATGT ACCCTAACCA GCGTGACTCA GCCTACAGCT 600 

CCTTCTCGGC CAGCTCAAAT GCTTCTGACT GTGCCCTTTC CCTCAGGCCA GAGGAGCCAG 660 

CCTCTACAGA CTGCATCATG CAAGGCCCAG GGCCAACTAA GGCCCCCAGT GGCCGGCCTA 720 

ATGTGGCTGA GACCTCAGGA GGTAGTCGGC GCACCAATGG GGGCCACCTG ACCCCCAGCT 780 

CTCAGATGTC ATCCCGTCCA CAGGAGGGAT ACCAGTCAGG GCCCGCCAAA GCAGTCAGGG 840 

GCCCACCACA ACCTCCAGTG AGGCGGGACA GCCTTCAGGC CTCCAGAGCC CAACTCCTCA 900 

ATGGAGAGCA GCGCAGGGCA TCTGAGCCTG TGGTCCCCTT GCCACAGAAG GAGAAACTGA 960 

GCTTAGAGCC TGTGCTACCC GCAAGGAACC CTAATAGGTT CTGTTGCCTC AGTGGGCATG 1020 

ACCAAGTGAC AAGTGAGGGC CATCAGAACT GTGAGTTCAG TCAGCCTCCT GAATCCAGCC 1080 

AACAGGGCTC TGAGCATCTA CTGATGCAGG CCTCAACCAA AGCTGTTGGA TCCCCAAAAG 1140 

CCTGTGACAG AGCTTCCAGC GTGGATTCCA ACCCACTCAA TGAGGCTTCT GCAGAGCTAG 1200 

CTAAGGCTTC TTTTGGCAGA CCTCCACATC TCATAGGACC CACAGGGCAT CGCCATAGTG 1260 

CCCCTCAACA GCTGGTGGCA TCCCACCTGC AGCATGTGCA CCTTGATACC AGGGGCAGCA 1320

AAGGGATGGA GCTCCCACCC GTACAGGATG GGCACCAGTG GACTCTGTCC CCTTTGCACA 1380 
GCAGCCACAA AGGGAAGAAA AGTCCATGCC CCCCTACAGG AGGAACCCAT GACCAGTCCA 1440
GCAAAGAAAG AAAGACCAGA CAAGTGGATG ACAGGTCTTT AGTTTTGGGA CACCAGAGCC 1500
AAAGCAGTCC CCCACATGGA GAGGCTGATG GACACCCCTC AGAAAAAGGT TTCCTGGACC 1560
CAAACAGAAC AAGCAGAGCA GCCAGTGAAT TGGCCAACCA GCAACCCTCT GCCTCTGGCT 1620
CCCTTGTTCA ACAAGCCACG GACTGTTCTT CAACCACTAA AGCAGCTAGT GGCACAGAGG 1680
CAGGTGAAGA AGGGGACAGC GAGCCCAAGG AGTGCAGCCG GATGGGTGGT AGGCGAAGTG 1740
GAGGGACCCG GGGCCGCTCG ATCCAAAACC GGCGGAAGAG TGAGCGTTTT GCTACCAATC 1800
TGCGTAATGA AATTCAGAGG AGGAAGGCCC AGCTCCAGAA AAGCAAGGGT CCCTTGTCAC 1860
AGCTGTGTGA CACTAAGGAG CCAGTGGAAG AGACCCAGGA GCCCCCAGAA AGTCCTCCAC 1920
TCACTGCCTC TAACACATCT CTTCTATCTT CATGTAAAAA ACCTCCCAGC CCCAGAGACA 1980
AGCTCTTCAA CAAAAGCATG ATGCTCAGAG CTAGGTCTTC CGAGTGCCTC AGCCAAGCCC 2040
CTGAGAGCCA TGAATCTAGG ACAGGCTTAG AGGGACGAAT AAGCCCTGGC CAGAGGCCTG 2100
GCCAGTCCTC TTTGGGCCTG AACACCTGGT GGAAAGCACC TGACCCATCC TCCTCAGACC 2160
CTGAGAAAGC ACATGCTCAC TGTGGAGTCC GTGGAGGTCA TTGGAGATGG TCTCCAGAGC 2220
ATAATTCACA GCCACTTGTG GCAGCAGCCA TGGAAGGCCC TTCCAACCCA GGTGACAACA 2280
AGGAATTGAA GGCTTCTACT GCTCAAGCTG GGGAGGATGC CATCCTCTTG CCTTTTGCAG 2340
ACAGAAGAAA GTTCTTTGAA GAGAGTAGCA AATCCTTATC TACATCTCAT TTGCCAGGTT 2400
TAACCACTCA TAGCAACAAG ACTTTTACCC AGAGACCAAA ACCTATAGAC CAAAACTTCC 2460
AGCCAATGAG CTCCAGCTGT AGGGAATTGA GGCGCCATCC CATGGACCAA TCATATCATT 2520
CCGCAGACCA ACCATATCAT GCCACAGACC AATCATATCA TTCCATGTCA CCCCTTCAGT 2580
CAGAAACTCC CACTTACTCA GAATGTTTTG CAAGCAAAGG TCTAGAAAAT TCCATGTGTT 2640
GTAAGCCACT ACACTGTGGT GATTTTGATT ACCACAGGAC CTGCTCTTAC TCCTGCAGTG 2700
TTCAAGGAGC TCTAGTCCAT GATCCTTGCA TTTATTGTTC TGGGGAAATC TGCCCTGCCT 2760
TGCTAAAGAG AAATATGATG CCAAATTGCT ACAACTGCCG GTGCCACCAC CACCAATGCA 2820
TTCGGTGTTC AGTTTGCTAT CATAATCCTC AGCACAGTGC CCTCGAGGAC AGCAGCTTGG 2880
CACCTGGCAA CACTTGGAAA CCCAGGAAGC TGACAGTGCA GGAATTTCCT GGGGACAAAT 2940
GGAATCCAAT AACAGGAAAC AGGAAGACCA GCCAGTCAGG GAGGGAAATG GCTCATTCCA 3000
AGACTAGCTT TTCATGGGCA ACCCCTTTCC ATCCTTGCCT TGAGAACCCA GCACTGGACT 3060
TGTCAAGCTA CCGAGCAATT TCTTCTCTTG ACCTCCTTGG AGACTTCAAA CATGCTTTGA 3120
AAAAATCAGA GGAAACTTCA GTTTATGAGG AGGGGAGCTC CCTTGCCTCC ATGCCCCACC 3180
CACTGCGCAG CCGTGCCTTC TCAGAGAGTC ACATCAGCTT GGCGCCCCAA AGCACCCGGG 3240
CCTGGGGGCA GCATAGGAGG GAGCTCTTTA GCAAAGGTGA TGAGACCCAG TCGGATCTTC 3300
TCGGAGCCAG GAAGAAGGCC TTTCCTCCTC CTCGCCCTCC TCCTCCCAAC TGGGAGAAGT 3360
ACAGGCTCTT TCGTGCAGCC CAGCAGCAGA AGCAGCAACA GCAGCAGCAG AAGCAACAGG 3420
AGGAGGAGGA GGAGGAGGAA GAAGAAGAAG AAGAGGAAGA GGAAGAGGAG GAGGAGGAGG 3480
CAGAGGAGGA GGAAGAGGAG CTGCCACCCC AGTATTTCAG TTCAGAAACC TCTGGTTCCT 3540
GTGCTCTCAA TCCTGAGGAG GTCCTAGAGC AGCCACAACC CCTCAGCTTT GGCCACCTGG 3600
AGGGCTCGAG ACAGGGTTCA CAAAGTGTCC CAGCAGAGCA AGAATCCTTT GCACTCCATT 3660
CCAGTGATTT CTTGCCTCCA ATAAGGGGTC ACTTGGGATC TCAACCTGAG CAGGCTCAGC 3720
CCCCTTGCTA CTATGGCATT GGTGGGCTTT GGAGGACATC GGGACAGGAA GCCACTGAAT 3780
CCGCCAAACA AGAGTTTCAG CACTTTTCGC CTCCTTCAGG GGCCCCAGGA ATCCCTACCT 3840
CTTACTCAGC TTATTACAAT ATTTCTGTGG CCAAGGCAGA GCTGCTGAAC AAACTGAAAG 3900
ACCAACCTGA GATGGCAGAG ATTGGCCTAG GAGAGGAGGA AGTTGACCAT GAACTGGCTC 3960
AAAAAAAGAT ACAGCTTATC GAAAGCATCA GCAGAAAACT TTCTGTCTTG CGGGAGGCCC 4020
AGCGAGGGCT GCTAGAGGAC ATCAATGCCA ATTCTGCCCT TGGGGAGGAG GTGGAGGCCA 4080
ACTTAAAAGC CGTCTGCAAA TCCAATGAAT TTGAAAAGTA CCACTTGTTT GTTGGGGACC 4140
TGGACAAAGT GGTCAACCTG TTGCTGTCAC TCTCTGGACG ACTGGCCCGG GTGGAGAATG 4200
CTCTGAACAG CATCGATTCA GAGGCCAACC AGGAGAAGTT GGTACTGATA GAGAAGAAGC 4260
AGCAGCTGAC GGGGCAGTTG GCAGATGCCA AGGAGCTGAA GGAGCACGTG GACCGCCGGG 4320
AGAAGTTGGT GTTTGGCATG GTCTCCCGCT ACCTGCCTCA GGACCAGCTC CAAGATTACC 4380
AGCACTTTGT CAAGATGAAA TCTGCTCTCA TCATTGAACA GCGAGAGCTG GAGGAGAAGA 4440
TCAAGCTCGG GGAAGAGCAA CTCAAATGTC TCAGGGAGAG TCTACTCCTG GGGCCCAGCA 4500
ATTTCTAATT CTACCAGCAC TCTGCCACAG CATCCCTGCC CAGCCATGTG GGAAGTGCTT 4560
TCAATCTTCT TTGTTAGCAG TTTCTCAGCA AGTAGATAGC AATTAGCAGT TTGTTCCAGC 4620
CCTCTACCCT GGATGTCTCT CACTACCCCT TCCCTAGCAG TGGTCCTAAC CAGCTAGGAG 4680
ACCCTGGGGA AGCCACAAGC TTCTACCCAA GGGAGCTGCA GCAAGGTGTG ATCTTAGAAC 4740
CACACTCTCC TTCCCACAGT TGCCAAGGGC AAGTACTTGC TGCACAGAGA ACCAAGGAAG 4800
TGCCTTCATT CTGCTTTGTA CTAGGACACC AAAGACATCA AGTACTCATC ACCCACCCAT 4860
ATCATCAACA GCCTCTAAAG GCTCAGAGGG AATCTGCCTT GCAGCTCTAC TCTGCCCCAG 4920
GGCTTGTGGC CAGCCATTTC TCACAGAGAG CXGGCTGCCT TGAGGGCATT CACCTGGCAC 4980
CAGTTTCAGG GCCTCACCCA AGCTTTGCAG GGGAAAGCAC AGAGGGAGGA ATTACACTGA 5040
AAAAAATGCA AGCAAAGGTT GAGTACCCCC AGGTGCCCCT TAGGAAGGAA CCAGGTTTAA 5100
ATAGGCTCTA CCCTTACCTT TCCCAGCAGC AAGTTCAGGG GAAGAGGCCT ACTCTTAGCC 5160
CTGGCTAGTG TGACCCTCTT CCTGTCCTAA GACTTTGGTC CTACCACCTC TTGTTTCATC 5220
TTTCCTTTAC ATTGCTGGGG GTTACCGCAG GTGCCTACCC CAGGGCTTCA CCATATGGGC 5280
CATTAATAGC TCTACTAAAA CTGACTTCTA GATGTAGGTT TCATTATTGG GGGAGGGGGT 5340
TCTTATTGTT ATATTTTAAA TGGCCTTTTG ATTTTATTTA TTTTTATGTT TTGATTATTT 5400
TTTTCTTTTT TAACTAATAA GGCGAGAAGA GGGAAGTTGG AGAGGGAAAA GTTAGCCCAG 5460
AAGGAAAGCA TTTTCTGCAG ATCAGCCTGA ATCCACCGTG GCTAGGCATA TTCTTGCTCT 5520
TCTCGTGTTG CTCACAACTA CCTGCCTGGA TGAATTTAGG AAAGTTGCAG GATACAAGGT 5580
TAAAACACAA GATCAAATGA ACAATCCGAA AATGTTATTA AGAAAACAGT TCCGGCCGGG 5640
CATGGTGGCT CACGCCTGAA ATCCCAGCAC TTTGGGAGGC CGAGGCAGGT GGATCACGAG 5700
GTCAGGAGAT CAAGACCATC CTGGCTAACA CGGTGAAACC CTATCTCTAC TAAAAATACA 5760
AAAAATTAGC CAGGTGTGGT GGCACGCACC AGTAGTCCCA GCTACTCGGG AGGCTGAGGC 5820
AGGAGAATTG CTTGAACCTG GAAGGCAGAG ATTGCAGTGA GCTGAGACCA CACCACTGCA 5880
CTCCATCCTG GGCAACAGAG TGAGACTTTG TCTCAAAAAG AAAGAAAGAA AGAAAGAAAG 5940
AAAGAAAGAA AGAAAAGAAA GAAAGAAAGA AAGAAAGAAA ACAGTTCCAT TTACAATAGC 6000
ATC 

Seq ID NO: 59 Protein sequence: 
Protein Accession #: XP_050478

1 11 21 31 41 51

I I I I I I

MENRPGSFQY VPVQLQGGAP WGFTLKGGLE HCEPLTVSKI EDGGKAALSQ KMRTGDELVN 60
INGTPLYGSR QEALILIKGS FRILKLIVRR RNAPVSRPHS WHVAKLLEGC PEAATTMHFP 120
SEAFSLSWHS GCNTSDVCVQ WCPLSRHCST EKSSSIGSME SLEQPGQATY ESHLLPIDQN 180
MYPNQRDSAY SSFSASSNAS DCALSLRPEE PASTDCIMQG PGPTKAPSGR PNVAETSGGS 240
RRTNGGHLTP SSQMSSRPQE GYQSGPAKAV RGPPQPPVRR DSLQASRAQL LNGEQRRASE 300
PVVPLPQKEK LSLEPVLPAR NPNRFCCLSG HDQVTSEGHQ NCEFSQPPES SQQGSEHLLM 360
QASTKAVGSP KACDRASSVD SNPLNEASAE LAKASFGRPP HLIGPTGHRH SAPQQLVASH 420
LQHVHLDTRG SKGMELPPVQ DGHQWTLSPL HSSHKGKKSP CPPTGGTHDQ SSKERKTRQV 480
DDRSLVLGHQ SQSSPPHGEA DGHPSEKGFL DPNRTSRAAS ELANQQPSAS GSLVQQATDC 540
SSTTKAASGT EAGEEGDSEP KECSRMGGRR SGGTRGRSIQ NRRKSERFAT NLRNEIQRRK 600
AQLQKSKGPL SQLCDTKEPV EETQEPPESP PLTASNTSLL SSCKKPPSPR DKLFNKSMML 660
RARSSECLSQ APESHESRTG LEGRISPGQR PGQSSLGLNT WWKAPDPSSS DPEKAHAHCG 720
VRGGHWRWSP EHNSQPLVAA AMEGPSNPGD NKELKASTAQ AGEDAILLPF ADRRKFFEES 780
SKSLSTSHLP GLTTHSNKTF TQRPKPIDQN FQPMSSSCRE LRRHPMDQSY HSADQPYHAT 840
DQSYHSMSPL QSETPTYSEC FASKGLENSM CCKPLHCGDF DYHRTCSYSC SVQGALVHDP 900
CIYCSGEICP ALLKRNMMPN CYNCRCHHHQ CIRCSVCYHN PQHSALEDSS LAPGNTWKPR 960
KLTVQEFPGD KWNPITGNRK TSQSGREMAH SKTSFSWATP FHPCLENPAL DLSSYRAISS 1020
LDLLGDFKHA LKKSEETSVY EEGSSLASMP HPLRSRAFSE SHISLAPQST RAWGQHRREL 1080
FSKGDETQSD LLGARKKAFP PPRPPPPNWE KYRLFRAAQQ QKQQQQQQKQ QEEEEEEEEE 1140
EEEEEEEEEE EAEEEEEELP PQYFSSETSG SCALNPEEVL EQPQPLSFGH LEGSRQGSQS 1200
VPAEQESFAL HSSDFLPPIR GHLGSQPEQA QPPCYYGIGG LWRTSGQEAT ESAKQEFQHF 1260
SPPSGAPGIP TSYSAYYNIS VAKAELLNKL KDQPEMAEIG LGEEEVDHEL AQKKIQLIES 1320
ISRKLSVLRE AQRGLLEDIN ANSALGEEVE ANLKAVCKSN EFEKYHLFVG DLDKVVNLLL 1380
SLSGRLARVE NALNSIDSEA NQEKLVLIEK KQQLTGQLAD AKELKEHVDR REKLVFGMVS 1440
RYLPQDQLQD YQHFVKMKSA LIIEQRELEE KIKLGEEQLK CLRESLLLGP SNF



Seq ID NO: 60 Nucleotide sequence: 
Nucleic Acid Accession #: NM_014705 

Coding sequence: 192..2489 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

GGGAGAAGCT AGGAAAAAAT GTCTTTGAGC TGTGAGATGC TTGTATATTT TGAAAATATG 60 

ATTATATGCA TGTGTTTGTA TTTTATGACT TGGATAATCT GAAAATCAAT TTGCTTTGTC 120 

AATGCTTCCT GGATTAGAAT TCCACTATTT GGTCCCTATC CTAGTCTACT AAAGAAAATT 180 

GAGCGGGAAA CATGGCGGGA AAGTGGCGTT TCATTAATTG CTACTGTAAC TCGTCTAATG 240 

GAGAGGTTGT TAGATTACAG AACTTCTATA AGACTGAACT GAACAAGGAG GAGATGTATA 300 

TACGCTACAT TCACAAACTC TATGATCTGC ATCTCAAAGC ACAGAACTTT ACAGAAGCTG 360 

CATATACCCT CCTCTTATAT GACGAGCTAC TGGAATGGTC TGATCGGCCC CTCAGGGAGT 420 

TCCTGACCTA CCCCATGCAA ACAGAATGGC AGCGCAAAGA GCACCTGCAC CTCACCATCA 480 

TCCAGAACTT TGACAGAGGC AAATGTTGGG AGAATGGCAT TATCTTGTGC CGGAAGATTG 540 

CAGAGCAGTA TGAGAGTTAT TATGACTACA GAAACCTGAG CAAGATGCGG ATGATGGAAG 600 

CCTCTTTGTA TGACAAAATT ATGGACCAGC AACGTCTTGA ACCAGAGTTC TTCAGAGTTG 660 

GATTTTATGG AAAAAAATTT CCATTTTTCT TAAGAAATAA GGAGTTTGTG TGTCGAGGGC 720 

ATGACTACGA GAGGCTGGAA GCCTTCCAAC AGAGAATGCT GAACGAGTTC CCCCATGCCA 780 

TCGCCATGCA GCACGCCAAC CAGCCCGATG AGACCATCTT CCAGGCAGAA GCTCAGTATT 840 

TGCAGATATA TGCTGTGACT CCCATTCCAG AGAGCCAGGA GGTCCTGCAG AGAGAGGGTG 900 

TTCCGGACAA CATCAAAAGC TTCTATAAAG TGAATCACAT CTGGAAATTC CGCTATGACC 960 

GACCATTTCA CAAAGGCACA AAAGATAAAG AGAATGAATT CAAGAGTCTC TGGGTGGAGA 1020 

GAACGTCATT ATACTTGGTG CAGAGTTTGC CTGGCATCTC TCGCTGGTTT GAAGTGGAAA 1080 

AGCGTGAAGT GGTAGAAATG AGTCCTCTGG AAAATGCAAT TGAAGTGCTA GAAAATAAGA 1140 

ATCAGCAGCT GAAGACTCTG ATTAGTCAGT GTCAGACAAG ACAGATGCAG AATATTAATC 1200 

CCCTGACTAT GTGCCTGAAT GGAGTTATAG ATGCTGCAGT TAATGGTGGC GTTTCCAGGT 1260 

ATCAAGAGGC ATTCTTTGTC AAAGAATATA TCTTAAGTCA CCCTGAAGAT GGGGAGAAAA 1320 

TTGCACGATT AAGAGAGCTG ATGCTTGAGC AGGCACAGAT TCTGGAATTT GGTTTGGCCG 1380 

TGCATGAGAA GTTTGTACCT CAAGATATGA GACCCCTTCA CAAAAAGCTG GTTGACCAAT 1440 

TCTTTGTGAT GAAGTCGAGC TTAGGGATAC AGGAGTTCTC TGCTTGTATG CAAGCCAGTC 1500 

CTGTCCATTT TCCTAATGGA AGCCCTCGTG TGTGTAGAAA CTCAGCACCT GCTTCTGTGA 1560 

GCCCAGATGG TACCAGGGTA ATTCCTAGAC GCAGCCCGTT AAGTTACCCA GCTGTCAACC 1620 

GATATTCTTC CTCCTCACTG TCCTCACAAG CTTCTGCTGA AGTAAGCAAT ATTACAGGGC 1680 

AATCAGAAAG CTCTGATGAA GTCTTTAACA TGCAGCCAAG TCCATCTACC TCAAGCTTGA 1740 

GTTCTACTCA CTCGGCTTCA CCTAATGTGA CAAGTTCTGC TCCATCGAGT GCCAGAGCTT 1800 

CTCCTTTGTT GTCTGACAAA CACAAACATT CCCGAGAAAA CTCTTGCCTG TCACCAAGAG 1860

AGAGACCATG CAGTGCCATC TATCCAACAC CTGTGGAGCC TTCGCAGAGG ATGCTGTTTA 1920 

ATCATATTGG AGACGGGGCC TTGCCACGCA GTGACCCAAA TCTCTCTGCA CCTGAAAAAG 1980 

CTTCACCAGC AAGACACACG ACATCAGTAT CCCCCTCGCC TGCCGGGCGA TCTCCATTGA 2040 
AGGGCTCTGT GCAGTCTTTC ACCCCCTCTC CAGTGGAGTA CCACTCGCCA GGACTCATCT 2100
CCAACTCCCC TGTCTTGTCG GGCAGCTACA GCAGTGGGAT TTCTTCTCTC AGCCGGTGCA 2160
GCACGTCGGA AACCTCAGGC TTTGAAAATC AGGTGAATGA ACAGTCGGCC CCCCTGCCGG 2220
TGCCAGTGCC GGTGCCCGTG CCGAGCTACG GCGGGGAGGA GCCAGTGCGC AAGGAGAGCA 2280
AGACTCCGCC CCCGTACAGC GTCTACGAGC GGACTCTGCG GCGCCCCGTC CCGCTACCTC 2340
ACAGCCTCTC CATCCCCGTC ACGTCGGAGC CGCCCGCGCT GCCCCCCAAG CCTCTGGCAG 2400
CGCGATCCAG CCACCTGGAG AATGGGGCCC GGAGGACTGA CCCCGGCCCG CGGCCCAGGC 2460
CCCTGCCCCG CAAGGTCTCT CAGTTATAAG TCACTTTTCT ATGTACCTGC GATGCATTCT 2520
TTGCCCGTTT ACAAAATAAG AAGTATGATG AGAAGACATT TAGTGTAGGC ACTTTAATAA 2580
CTTACTCAGC TCCTTCGATG AATGGAATTA AAACTTGCTT ATTAAATATC ATGTTGCACA 2640
ATATTAAAAG TTGCTGATCT AAAACGCCAG ATGTTAAATG AAGTATGGCT GAATTTCATT 2700
AAAACGTTTC TCATTTGGAA GTGGTAAATA GTGATAAAGA CTCCTTTTGT ACCTTTTTAT 2760
GTTCACTTTT TTTTATATAG TTTAATCTTA AAACCAATAC GATATTGTCA AACGATACAA 2820
TGTGTGACAA TGTTGTATCG TTTTTACTGA ATACTTGATA CTTGGAGAAA GCTTATTAAG 2880
TCAGTGCACA TCCTAACACA GTGGTCCTTA TTTTAGAAGA CTTCTGTAAA TAAGGCAAGG 2940
TTTATCAGTG CAGATCATCA GAATTAAAGT TCAAGCAGGC GAGCAAGACA GTATACTTAA 3000
GGGGTTGCAA AGCTTGGGAC TGGAAATTGT TTTGTTCTTG AAACAAAATA CTTCTTTAAG 3060
GTTGCTTTTG CTGTTTGACT GCTGTCTACA TTCGTAAAAT TCTATTTTGT GAATTGGTAG 3120
CTAAATCCCT TACTACCCTG ACACCGTGGT ATCTACTGTA TTTCTTTTCA AGGTGCAATT 3180
TGCTTCAGAG TTCCAATCAG CTAGATTAAG CAAGAGGCTC CAGAAGAAAT GTTTACTTGA 3240
ATTTTGCGCT TCCTTTCTTG ATAGTTTCCT ATATAAAATT TGTCATTGAA CAAGAGCAAA 3300
TGCTGAAGTA TTAATGAGGC ACAAATGACT GTGCCCCATT AGCAAGAATT CAGGAATCAA 3360
TACAGACAGT ATTAAATTAA TAGCTTAAGT GAAGAAAAAA AAAAACTTAG TGAAAATGTA 3420
TTAGCACGAT TAAATGGCAA AAGGACTTAT AAAAGGCAAG GGCATTAACT TTCAGTCCTG 3480
CACAAAATAA AAAATTCCTC ACGACTCTCC ACTTTTACCA GTGGAGTTTG TCTTAGCTGA 3540
CCTGTCGTCT TTCTCTTGAA GGAGGATTGC TGTAGACTTC TCTAGCTTGA ATATTGCAAC 3600
ATAGCATCTT AGGTCTAGAT AGGGATGCTA ATGCCAGTTG TAGAAGTGTG AAAAAAGCAC 3660
CTTGTATGTA GTAATGTATT TTATATCTTT GTTTTTTCTT TTACTGACTG TTTATAACAC 3720
TCAATTGACA ATAGATATGA ACTGTATTTT AAATCATACT GTTAAATATT TTCCCTCTTT 3780
TGTTGGGAAG CTCATTTTAG TTTAACCATG TTTGTTTTGT TGGTAGCTTA CCTGGAAGGC 3840
AGTGACCACT TTTTTATATT CTCTTAATGA AACCATTCAG CAGGTATATG CTGTTGAGGC 3900
TGGTTATAGA GGTTTTCTAT AATAAATGTT CAAGTATTTT TGTATATAAC TGG^TAATTT 3960
TAATAAGAGA TACCATTATG TGTAAAAAAA AGTAAAAATA AACGCAAACA GTTGTTGATG 4020
CAGTATGATT GTTATAATTA TGCCAAATAC TTTACGTATG GAAAAAGAAT ATTTGTACAT 4080
ATGTGCTTTT AACAATTCTG CCATATTGAC TTTACAATTT TGAATGTCGG AAAAATTAAT 4140
ATATGTTAAA TATTTATGTT TAGTGAAAGT GTTCATAATT GAGAAAAGGA ACATATGCAT 4200
TTTAGCTTTG TATCTTGCAA GTTTTGCAGT CAGAAATTTT TTGAACTAGC TTTTGCTTTT 4260
GATAACACTT CGTGTTTGTA ACCACATTCA TATATATATA CATATATATG TGAAGCTCCA 4320
TATTTCTGTT GCTTTAAAGA AGTAAAACCT TCCATTTAAA TAAGATGACA TGCATAAGAT 4380
AACAAAGCTT CCTTGATTTC CTTTTCCTGT GTAATTTAAT AGATTTGTTG ACTAGTGCTT 4440
GGGCACATTA TAAATCAGTG TTATTTGCTC TTGGAGCCAT TTTTTAAAAA AAATTTTGGC 4500
AGTGAGCAGT TGAATTTATC TTGAATTTAT CATGTGTGTG TATTTCTGAA GCAGCTACAT 4560
AGCAGAACAT TTTAAGAGAT TCTGTTAGCC CACATGTTCA TGTTGGTTGC TGCTGAATGG 4620
TAAATATTAA ATAAAATTAC CAGATTAATC TT



Seq ID NO: 61 Protein sequence: 
Protein Accession #: NP_055520



1          11         21         31         41         51

|          |          |          |          |          |

MAGKWRFINC YCNSSNGEVV RLQNFYKTEL NKEEMYIRYI HKLYDLHLKA QNFTEAAYTL 60

LLYDELLEWS DRPLREFLTY PMOTEWQRKE HLHLTIIQNF DRGKCWENGI ILCRKIAEQY 120

ESYYDYRNLS KMRMMEASLY DKIMDQQRLE PEFFRVGFYG KKFPFFLRNK EFVCRGHDYE 180

RLEAFQQRML NEFPHAIAMQ HANQPDETIF OAEAQYLQIY AVTPIPESQE VLQREGVPDN 240

XKSFYKVNHI WKFRYDRPFH XGTKDKENEF KSLWVERTSL YLVQSLPGIS RWFEVEKREV 300

VEMSPLENAI EVLENKNQQL KTLISQCQTR QMQNTNPLTM CLNGVIDAAV NGGVSRYQEA 360

FPVKEYILSH PEDGEKIARL RELMLEQAQI LEFGLAVHEK FVPQDMRPLH KKLVDQFFVM 420

KSSLGIQSFS ACMQASPVHF PNGSPRVCRN SAPASVSPDG TRVIPRRSPL SYPAVNRYSS 480

SSLSSQASAE VSNITGQSES SDEVFNMQPS PSTSSLSSTH SASPNVTSSA PSSARASPLL 540

SDKHKHSREN SCLSPRERPC SAIYPTPVEP SQRMLFNHIG DGALPRSDPN LSAPEKASPA 600

RHTTSVSPSP AGRSPLKGSV QSFTPSPVEY HSPGLISNSP VLSGSYSSGI SSLSRCSTSE 660

TSGFENQVNE QSAPLPVPVP VPVPSYGGEE PVRKESKTPP PYSVYERTLR RPVPLPHSLS 720
IPVTSEPPAL PPKPLAARSS HLKNGARRTD PGPRPRPLPR KVSQL



Seq ID NO: 62 Nucleotide sequence: 

Nucleic Acid Accession #: fgenesh prediction 

Coding sequence: 1..2561 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51

|          |          |          |          |          |

ATGGACCGAG GCCAGGGTAA GAGGGGCCGC GACGCCCGCA CTTGTTGCGG CGCCGGGCGG 60 

GAAAGGGAGA CTGGACGATC TGAAGCCGGA GAGGAGGAGG GAGAGAGGCG GGCGGTGGGG 120 

CGGGGGCTGA GGAACGCTCG GAGGGGACTG GGAGACGCGG CGCTTATGCA AAGGTGCCTT 180 

CGGCTGCCGG GACAACCCGC CAGCAACCAG GTACAGCTCT CAGAGGTTCC ACAGAGGAAG 240 

CTCAGGGTCC CTGAATCTCC CAGTGTGGCA GAGAAAGTGA AACTTGGTCA CCGATGCCTG 300

GAACTGCTGG AGCAGCTGCT CCCAGAGCTC ACCGGGCTGC TCAGCCTCCT GGACCACGAG 360

TACCTCAGCG ATACCACCCT GGAAAAGAAG ATGGCCGTGG CCTCCATCCT GCAGAGCCTG 420

CAGCCCCTTC CAGCAAAGGA GGTCTCCTAC CTGTATGTGA ACACAGCAGA CCTCCACTCG 480 

GGGCCCAGCT TCGTGGAATC CCTCTTTGAA GAATTTGACT GTGACCTGAG TGACCTTCGG 540 

GACATGCCAG AGGATGATGG GGAGCCCAGC AAAGGAGCCA GCCCTGAGCT AGCCAAGAGC 600 

CCACGCCTGA GAAACGCGGC CGACCTGCCT CCACCGCTCC CCAACAAGCC TCCCCCTGAG 660

GACTACTATG AAGAGGCCCT TCCTCTGGGA CCCGGCAAGT CGCCTGAGTA CATCAGCTCC 720 

CACAATGGCT GCAGCCCCTC ACACTCGATT GTGGATGGCT ACTATGAGGA CGCAGACAGC 780 

AGCTACCCTG CAACCAGGGT GAACGGCGAG CTTAAGAGCT CCTATAATGA CTCTGACGCA 840 

ATGAGCAGCT CCTATGAGTC CTACGATGAA GAGGAGGAGG AAGGGAAGAG CCCGCAGCCC 900 

CGACACCAGT GGCCCTCAGA GGAGGCCTCC ATGCACCTGG TGAGGGAATG CAGGATATGT 960

GCCTTCCTGC TGCGGAAAAA GCGTTTCGGG CAGTGGGCCA AGCAGCTGAC GGTCATCAGG 1020 

GAGGACCAGC TCCTGTGTTA CAAAAGCTCC AAGGATCGGC AGCCACATCT GAGGTTGGCA 1080 

CTGGATACCT GCAGCATCAT CTACGTGCCC AAGGACAGCC GGCACAAGAG GCACGAGCTG 1140 

CGTTTCACCC AGGGGGCTAC CGAGGTCTTG GTGCTGGCAC TGCAGAGCCG AGAGCAGGCC 1200 

GAGGAGTGGC TGAAGGTCAT CCGAGAAGTG AGCAAGCCAG TTGGGGGAGC TGAGGGAGTG 1260 

GAGGTCCCCA GATCCCCAGT CCTCCTGTGC AAGTTGGACC TGGACAAGAG GCTGTCCCAA 1320 

GAGAAGCAGA CCTCAGATTC TGACAGCGTG GGTGTGGGTG ACAACTGTTC TACCCTTGGC 1380 

CGCCGGGAGA CCTGTGATCA CGGCAAAGGG AAGAAGAGCA GCCTGGCAGA ACTGAAGGGC 1440 

TCAATGAGCA GGGCTGCGGG CCGCAAGATC ACCCGTATCA TTGGCTTCTC CAAGAAGAAG 1500 

ACACTGGCCG ATGACCTGCA GACGTCCTCC ACCGAGGAGG AGGTTCCCTG CTGTGGCTAC 1560 

CTGAACGTGC TGGTGAACCA GGGCTGGAAG GAACGCTGGT GCCGCCTGAA GTGCAACACT 1620 

CTGTATTTCC ACAAGGATCA CATGGACCTG CGAACCCATG TGAACGCCAT CGCCCTGCAA 1680 

GGCTGTGAGG TGGCCCCGGG CTTTGGGCCC CGACACCCAT TTGCCTTCAG GATCCTGCGC 1740 

AACCGGCAGG AGGTGGCCAT CTTGGAGGCA AGCTGTTCAG AGGACATGGG TCGCTGGCTC 1800 

GGGCTGCTGC TGGTGGAGAT GGGCTCCAGA GTCACTCCGG AGGCGCTGCA CTATGACTAC 1860 

GTGGATGTGG AGACCTTAAC CAGCATCGTC AGTGCTGGGC GCAACTCCTT CCTATATGCA 1920 

AGATCCTGCC AGAATCAGTG GCCTGAGCCC CGAGTCTATG ATGATGTTCC TTATGAAAAG 1980 

ATGCAGGACG AGGAGCCCGA GCGCCCCACA GGGGCCCAGG TGAAGCGTCA CGCCTCCTCC 2040 

TGCAGTGAGA AGTCCCATCG TGTGGACCCG CAGGTCAAAG TCAAACGCCA CGCCTCCAGT 2100 

GCCAATCAAT ACAAGTATGG CAAGAACCGA GCCGAGGAGG ATGCCCGGAG GTACTTGGTA 2160 

GAAAAAGAGA AGCTGGAGAA AGAGAAAGAG ACGATTCGGA CAGAGCTGAT AGCACTGAGA 2220 

CAGGAGAAGA GGGAACTGAA GGAAGCCATT CGGAGCAGCC CAGGAGCAAA ATTAAAGGCT 2280 

CTGGAAGAAG CCGTGGCCAC CCTGGAAGCT CAGTGTCGGG CAAAGGAGGA GCGCCGGATT 2340 

GACCTGGAGC TGAAGCTGGT GGCTGTGAAG GAGCGCTTGC AGCAGTCCCT GGCAGGAGGG 2400 

CCAGCCCTGG GGCTCTCCGT GAGCAGCAAG CCCAAGAGTG GGCAACTCTC TGAGGAAGAT 2460 

ACGCTCACCT CCAATGGTGC TCTCTCAGAG AGAACTTCTC TGACCTCATC TACACCAGGG 2520 
CTTCTCAACC CCAACACTAC TGACATTTTG GACCAGTAA



Seq ID NO: 63 Protein sequence:

Protein Accession #: fgenesh prediction 



1          11         21         31         41         51

|          |          |          |          |          |

MDRGQGKRGR DARTCCGAGR ERETGRSEAG EEEGERRAVG RGLRNARRGL GDAALMQRCL 60

RLPGQPASNQ VQLSEVPQRK LRVPESPSVA EKVKLGHRCL ELLEQLLPEL TGLLSLLDHE 120 

YLSDTTLEKK MAVASILQSL QPLPAKEVSY LYVNTADLHS GPSFVESLFE EFDCDLSDLR 180 

DMPEDDGEPS KGASPELAKS PRLRNAADLP PPLPNKPPPE DYYEEALPLG PGKSPEYISS 240 

HNGCSPSHSI VDGYYEDADS SYPATRVNGE LKSSYNDSDA MSSSYESYDE EEEEGKSPQP 300 

RHQWPSEEAS MHLVRECRIC AFLLRKKRFG QWAKQLTVIR EDQLLCYKSS KDRQPHLRLA 360 

LDTCSIIYVP KDSRHKRHEL RFTQGATEVL VLALQSREQA EEWLKVIREV SKPVGGAEGV 420

EVPRSPVLLC KLDLDKRLSQ EKQTSDSDSV GVGDNCSTLG RRETCDHGKG KKSSLAELKG 480 

SMSRAAGRKI TRIIGFSKKK TLADDLQTSS TEEEVPCCGY LNVLVNQGWK ERWCRLKCNT 540

LYFHKDHMDL RTHVNAIALQ GCEVAPGFGP RHPFAFRILR NRQEVAILEA SCSEDMGRWL 600

GLLLVEMGSR VTPEALHYDY VDVETLTSIV SAGRNSFLYA RSCQNQWPEP RVYDDVPYEK 660

MQDEEPERPT GAQVKRHASS CSEKSHRVDP QVKVKRHASS ANQYKYGKNR AEEDARRYLV 720

EKEKLEKEKE TIRTELIALR QEKRELKEAI RSSPGAKLKA LEEAVATLEA QCRAKEERRI 780 

DLELKLVAVK ERLQQSLAGG PALGLSVSSK PKSGQLSEED TLTSNGALSE RTSLTSSTPG 840 
LLNPNTTDIL DQ

Seq ID NO: 64 Nucleotide sequence: 
Nucleic Acid Accession #: NM_004126.1 

Coding sequence: 108-329 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51

|          |          |          |          |          |

GGCACGAGCT CGTGCCGGCC TTCAGTTGTT TCGGGACGCG CCGAGCTTCG CCGCTCTTCC 60

AGCGGCTCCG CTGCCAGAGC TAGCCCGAGC CCGGTTCTGG GGCGAAAATG CCTGCCCTTC 120

ACATCGAAGA TTTGCCAGAG AAGGAAAAAC TGAAAATGGA AGTTGAGCAG CTTCGCAAAG 180 

AAGTGAAGTT GCAGAGACAA CAAGTGTCTA AATGTTCTGA AGAAATAAAG AACTATATTG 240 

AAGAACGTTC TGGAGAGGAT CCTCTAGTAA AGGGAATTCC AGAAGACAAG AACCCCTTTA 300 

AAGAAAAAGG CAGCTGTGTT ATTTCATAAA TAACTTGGGA GAAACTGCAT CCTAAGTGGA 360 

AGAACTAGTT TQTTTTAGTT TTCCCAGATA AAACCAACAT GCTTTTTAAG GAAGGAAGAA 420

TGAAATTAAA AGGAGACTTT CTTAAGCACC ATATAGATAG GGTTATGTAT AAAAGCATAT 480

GTGCTACTCA TCTTTGCTCA CTATGCAGTC TTTTTTAAGA GAGCAGAGAG TATCAGATGT 540 

ACAATTATGG AAATAAGAAC ATTACTTGAG CATGACACTT CTTTCAGTAT ATTGCTTGAT 600 
GCTTCAAATA AAGTTTTGTC TT 



Seq ID NO: 65 Protein sequence:
Protein Accession #: NP_004117



1          11         21         31         41         51

|          |          |          |          |          |

MPALHIEDLP EKEKLKMEVE QLRKEVKLQR QQVSKCSEEI KNYIEERSGE DPLVKGIPED 60
KNPFKEKGSC VIS 



Seq ID NO: 66 Nucleotide sequence:
Nucleic Acid Accession #: NM_003842.1

Coding sequence: 1-1236 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51

|          |          |          |          |          |

ATGGAACAAC GGGGACAGAA CGCCCCGGCC GCTTCGGGGG CCCGGAAAAG GCACGGCCCA 60

GGACCCAGGG AGGCGCGGGG AGCCAGGCCT GGGCCCCGGG TCCCCAAGAC CCTTGTGCTC 120

GTTGTCGCCG CGGTCCTGCT GTTGGTCTCA GCTGAGTCTG CTCTGATCAC CCAACAAGAC 180 

CTAGCTCCCC AGCAGAGAGC GGCCCCACAA CAAAAGAGGT CCAGCCCCTC AGAGGGATTG 240 

TGTCCACCTG GACACCATAT CTCAGAAGAC GGTAGAGATT GCATCTCCTG CAAATATGGA 300 

CAGGACTATA GCACTCACTG GAATGACCTC CTTTTCTGCT TGCGCTGCAC CAGGTGTGAT 360 

TCAGGTGAAG TGGAGCTAAG TCCCTGCACC ACGACCAGAA ACACAGTGTG TCAGTGCGAA 420 

GAAGGCACCT TCCGGGAAGA AGATTCTCCT GAGATGTGCC GGAAGTGCCG CACAGGGTGT 480 

CCCAGAGGGA TGGTCAAGGT CGGTGATTGT ACACCCTGGA GTGACATCGA ATGTGTCCAC 540 

AAAGAATCAG GCATCATCAT AGGAGTCACA GTTGCAGCCG TAGTCTTGAT TGTGGCTGTG 600 

TTTGTTTGCA AGTCTTTACT GTGGAAGAAA GTCCTTCCTT ACCTGAAAGG CATCTGCTCA 660 

GGTGGTGGTG GGGACCCTGA GCGTGTGGAC AGAAGCTCAC AACGACCTGG GGCTGAGGAC 720 

AATGTCCTCA ATGAGATCGT GAGTATCTTG CAGCCCACCC AGGTCCCTGA GCAGGAAATG 780 

GAAGTCCAGG AGCCAGCAGA GCCAACAGGT GTCAACATGT TGTCCCCCGG GGAGTCAGAG 840 

CATCTGCTGG AACCGGCAGA AGCTGAAAGG TCTCAGAGGA GGAGGCTGCT GGTTCCAGCA 900 

AATGAAGGTG ATCCCACTGA GACTCTGAGA CAGTGCTTCG ATGACTTTGC AGACTTGGTG 960 

CCCTTTGACT CCTGGGAGCC GCTCATGAGG AAGTTGGGCC TCATGGACAA TGAGATAAAG 1020 

GTGGCTAAAG CTGAGGCAGC GGGCCACAGG GACACCTTGT ACACGATGCT GATAAAGTGG 1080 

GTCAACAAAA CCGGGCGAGA TGCCTCTGTC CACACCCTGC TGGATGCCTT GGAGACGCTG 1140 

GGAGAGAGAC TTGCCAAGCA GAAGATTGAG GACCACTTGT TGAGCTCTGG AAAGTTCATG 1200 
TATCTAGAAG GTAATGCAGA CTCTGCCATG TCCTAA 



Seq ID NO: 67 Protein sequence: 
Protein Accession #: NP_003833.1 

1          11         21         31         41         51

|          |          |          |          |          |

MEQRGQNAPA ASGARKRHGP GPREARGARP GPRVPKTLVL VVAAVLLLVS AESALITQQD 60
LAPQQRAAPQ QKRSSPSEGL CPPGHHISED GRDCISCKYG QDYSTHWNDL LFCLRCTRCD 120
SGEVELSPCT TTRNTVCQCE EGTFREEDSP EMCRKCRTGC PRGMVKVGDC TPWSDIECVH 180
KESGIIIGVT VAAVVLIVAV FVCKSLLWKK VLPYLKGICS GGGGDPERVD RSSQRPGAED 240
NVLNEIVSIL QPTQVPEQEM EVQEPAEPTG VNMLSPGESE HLLEPAEAER SQRRRLLVPA 300 
NEGDPTETLR QCFDDFADLV PFDSWEPLMR KLGLMDNEIK VAKAEAAGHR DTLYTMLIKW 360 
VNKTGRDASV HTLLDALETL GERLAKQKIE DHLLSSGKFM YLEGNADSAM S 



Seq ID NO: 68 Nucleotide sequence: 

Nucleic Acid Accession #: FGENESH predicted ORF

Coding sequence: 361-2220 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51

|          |          |          |          |          |

GGCACCATCT GCTCCCTGCC CTGCCCAGAG GGCTTTCACG GACCCAACTG CTCCCAGGAA 60

TGTCGCTGCC ACAACGGCGG CCTCTGTGAC CGATTCACTG GGCAGTGCCG CTGCGCTCCG 120

GGTTACACTG GGGATCGGTG CCGGGAGGAG TGCCCGGTGG GCCGCTTTGG GCAGGACTGT 180

GCTGAGACGT GCGACTGCGC CCCGGACGCC CGTTGCTTCC CGGCCAACGG CGCATGTCTG 240

TGCGAACACG GCTTCACTGG GGACCGCTGC ACGGATCGCC TCTGCCCCGA CGGCTTCTAC 300

GGTCTCAGCT GCCAGGCCCC CTGCACCTGC GACCGGGAGC ACAGCCTCAG CTGCCACCCG 360

ATGAACGGGG AGTGCTCCTG CCTGCCGGGC TGGGCGGGCC TCCACTGCAA CGAGAGCTGC 420

CCGCAGGACA CGCATGGGCC AGGGTGCCAG GAGCACTGTC TCTGCCTGCA CGGTGGCGTC 480

TGCCAGGCTA CCAGCGGCCT CTGTCAGTGC GCGCCGGGTT ACACGGGCCC TCACTGTGCT 540

AGTCTTTGTC CTCCTGACAC CTACGGTGTC AACTGTTCTG CACGCTGCTC ATGTGAAAAT 600

GCCATCGCCT GCTCACCCAT CGACGGCGAG TGCGTCTGCA AGGAAGGTTG GCAGCGTGGT 660

AACTGCTCTG TGCCCTGCCC ACCCGGAACC TGGGGCTTCA GTTGCAATGC CAGCTGCCAG 720

TGTGCCCATG AGGCAGTCTG CAGCCCCCAA ACTGGAGCCT GTACCTGCAC CCCTGGGTGG 780

CATGGGGCCC ACTGCCAGCT GCCCTGTCCG AAGGGGCAGT TTGGAGAAGG TTGTGCCAGT 840

CGCTGTGACT GTGACCACTC TGATGGCTGT GACCCTGTTC ATGGACGCTG TCAGTGCCAG 900

GCTGGCTGGA TGGGTGCCCG CTGCCACCTG TCCTGCCCTG AGGGCTTATG GGGAGTCAAC 960

TGTAGCAACA CCTGCACCTG CAAGAATGGG GGCACCTGTC TCCCTGAGAA TGGCAACTGC 1020

GTGTGTGCAC CCGGATTCCG GGGCCCCTCC TGCCAGAGAT CCTGTCAGCC TGGCCGCTAT 1080

GGCAAACGCT GTGTGCCCTG CAAGTGCGCT AACCACTCCT TCTGCCACCC CTCGAACGGG 1140

ACCTGCTACT GCCTGGCTGG CTGGACAGGC CCCGACTGCT CCCAGCGCTG CCCTCTGGGG 1200

ACATTTGGTG CTAACTGCTC CCAGCCATGC CAGTGTGGTC CTGGAGAAAA GTGCCACCCA 1260

GAGACTGGGG CCTGTGTATG TCCCCCAGGG CACAGTGGTG CACCTTGCAG GATTGGAATC 1320

CAGGAGCCCT TTACTGTGAT GCCGACCACT CCAGTAGCGT ATAACTCGCT GGGTGCAGTG 1380

ATTGGCATTG CAGTGCTGGG GTCCCTTGTG GTAGCCCTGG TGGCACTGTT CATTGGCTAT 1440

CGGCACTGGC AAAAAGGCAA GGAGCACCAC CACCTGGCTG TGGCTTACAG CAGCGGGCGC 1500

CTGGACGGCT CCGAGTATGT CATGCCAGAT GTCCCTCCGA GCTACAGTCA CTACTACTCC 1560

AACCCCAGCT ACCACACCCT GTCGCAGTGC TCCCCAAACC CCCCACCCCC TAACAAGGTT 1620

CCAGGCCCGC TCTTTGCCAG CCTGCAGAAC CCTGAGCGGC CAGGTGGGGC CCAAGGGCAT 1680

GATAACCACA CCACCCTGCC TGCTGACTGG AAGCACCGCC GGGAGCCCCC TCCAGGGCCT 1740

CTGGACAGGG GGAGCAGCCG CCTGGACCGA AGCTACAGCT ATAGCTACAG CAATGGCCCA 1800

GGCCCATTCT ACAATAAAGG GCTCATCTCT GAAGAGGAGC TCGGGGCCAG TGTGGCTTCC 1860

CTGAGCAGTG AGAACCCATA TGCCACCATC CGGGACCTGC CCAGCTTGCC AGGGGGCCCC 1920

CGGGAGAGCA GCTACATGGA GATGAAAGGC CCTCCCTCAG GATCTCCCCC CAGGCAGCCT 1980

CCTCAGTTCT GGGACAGCCA GAGGCGGCGG CAACCCCAGC CACAGAGAGA CAGTGGCACC 2040

TACGAGCAGC CCAGCCCCCT GATCCATGAC CGAGACTCTG TGGGCTCCCA GCCCCCTCTG 2100

CCTCCGGGCC TACCCCCCGG CCACTATGAC TCACCCAAGA ACAGCCACAT CCCTGGACAT 2160
TATGACTTGC CTCCAGTACG GCATCCCCCA TCACCTCCAC TTCGACGCCA GGACCGTTGA



Seq ID NO: 69 Protein sequence: 

Protein Accession #: FGENESH prediction



1          11         21         31         41         51

|          |          |          |          |          |

GTICSLPCPE GFHGPNCSQE CRCHNGGLCD RFTGQCRCAP GYTGDRCREE CPVGRFGQDC 60

AETCDCAPDA RCFPANGACL CEHGFTGDRC TDRLCPDGFY GLSCQAPCTC DREHSLSCHP 120

MNGECSCLPG WAGLHCNESC PQDTHGPGCQ EHCLCLHGGV CQATSGLCQC APGYTGPHCA 180

SLCPPDTYGV NCSARCSCEN AIACSPIDGE CVCKEGWQRG NCSVPCPPGT WGFSCNASCQ 240

CAHEAVCSPQ TGACTCTPGW HGAHCQLPCP KGQFGEGCAS RCDCDHSDGC DPVHGRCQCQ 300

AGWMGARCHL SCPEGLWGVN CSNTCTCKNG GTCLPENGNC VCAPGFRGPS CQRSCQPGRY 360

GKRCVPCKCA NHSFCHPSNG TCYCLAGWTG PDCSQRCPLG TFGANCSQPC QCGPGEKCHP 420

ETGACVCPPG HSGAPCRIGI QEPFTVMPTT PVAYNSLGAV IGIAVLGSLV VALVALFIGY 480

RHWQKGKEHH HLAVAYSSGR LDGSEYVMPD VPPSYSHYYS NPSYHTLSQC SPNPPPPNKV 540

PGPLFASLQN PERPGGAQGH DNHTTLPADW KHRREPPPGP LDRGSSRLDR SYSYSYSNGP 600

GPFYNKGLIS EEELGASVAS LSSENPYATI RDLPSLPGGP RESSYMEMKG PPSGSPPRQP 660

PQFWDSQRRR QPQPQRDSGT YEQPSPLIHD RDSVGSQPPL PPGLPPGHYD SPKNSHIPGH 720
YDLPPVRHPP SPPLRRQDR



Seq ID NO: 70 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005458 

Coding sequence: 1..2826 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51

|          |          |          |          |          |

ATGGCTTCCC CGCGGAGGTC CGGGCAGCCA GGGCGGCCGC CGCCGCCGCC ACCGCCGCCC 60

GCGCGCCTGC TACTGCTACT GCTGCTGCCG CTGCTGCTGC CTCTGGCGCC CGGGGCCTGG 120

GGCTGGGCGC GGGGCGCCCC CCGGCCGCCG CCCAGCAGCC CGCCGCTCTC CATCATGGGC 180

CTCATGCCGC TCACCAAGGA GGTGGCCAAG GGCAGCATCG GGCGCGGTGT GCTCCCCGCC 240

GTGGAACTGG CCATCGAGCA GATCCGCAAC GAGTCACTCC TGCGCCCCTA CTTCCTCGAC 300

CTGCGGCTCT ATGACACGGA GTGCGACAAC GCAAAAGGGT TGAAAGCCTT CTACGATGCA 360

ATAAAATACG GGCCGAACCA CTTGATGGTG TTTGGAGGCG TCTGTCCATC CGTCACATCC 420

ATCATTGCAG AGTCCCTCCA AGGCTGGAAT CTGGTGCAGC TTTCTTTTGC TGCAACCACG 480

CCTGTTCTAG CCGATAAGAA AAAATACCCT TATTTCTTTC GGACCGTCCC ATCAGACAAT 540

GCGGTGAATC CAGCCATTCT GAAGTTGCTC AAGCACTACC AGTGGAAGCG CGTGGGCACG 600

CTGACGCAAG ACGTTCAGAG GTTCTCTGAG GTGCGGAATG ACCTGACTGG AGTTCTGTAT 660

GGCGAGGACA TTGAGATTTC AGACACCGAG AGCTTCTCCA ACGATCCCTG TACCAGTGTC 720

AAAAAGCTGA AGGGGAATGA TGTGCGGATC ATCCTTGGCC AGTTTGACCA GAATATGGCA 780

GCAAAAGTGT TCTGTTGTGC ATACGAGGAG AACATGTATG GTAGTAAATA TCAGTGGATC 840

ATTCCGGGCT GGTACGAGCC TTCTTGGTGG GAGCAGGTGC ACACGGAAGC CAACTCATCC 900

CGCTGCCTCC GGAAGAATCT GCTTGCTGCC ATGGAGGGCT ACATTGGCGT GGATTTCGAG 960

CCCCTGAGCT CCAAGCAGAT CAAGACCATC TCAGGAAAGA CTCCACAGCA GTATGAGAGA 1020

GAGTACAACA ACAAGCGGTC AGGCGTGGGG CCCAGCAAGT TCCACGGGTA CGCCTACGAT 1080

GGCATCTGGG TCATCGCCAA GACACTGCAG AGGGCCATGG AGACACTGCA TGCCAGCAGC 1140

CGGCACCAGC GGATCCAGGA CTTCAACTAC ACGGACCACA CGCTGGGCAG GATCATCCTC 1200

AATGCCATGA ACGAGACCAA CTTCTTCGGG GTCACGGGTC AAGTTGTATT CCGGAATGGG 1260

GAGAGAATGG GGACCATTAA ATTTACTCAA TTTCAAGACA GCAGGGAGGT GAAGGTGGGA 1320

GAGTACAACG CTGTGGCCGA CACACTGGAG ATCATCAATG ACACCATCAG GTTCCAAGGA 1380

TCCGAACCAC CAAAAGACAA GACCATCATC CTGGAGCAGC TGCGGAAGAT CTCCCTACCT 1440

CTCTACAGCA TCCTCTCTGC CCTCACCATC CTCGGGATGA TCATGGCCAG TGCTTTTCTC 1500

TTCTTCAACA TCAAGAACCG GAATCAGAAG CTCATAAAGA TGTCGAGTCC ATACATGAAC 1560

AACCTTATCA TCCTTGGAGG GATGCTCTCC TATGCTTCCA TATTTCTCTT TGGCCTTGAT 1620

GGATCCTTTG TCTCTGAAAA GACCTTTGAA ACACTTTGCA CCGTCAGGAC CTGGATTCTC 1680

ACCGTGGGCT ACACGACCGC TTTTGGGGCC ATGTTTGCAA AGACCTGGAG AGTCCACGCC 1740

ATCTTCAAAA ATGTGAAAAT GAAGAAGAAG ATCATCAAGG ACCAGAAACT GCTTGTGATC 1800

GTGGGGGGCA TGCTGCTGAT CGACCTGTGT ATCCTGATCT GCTGGCAGGC TGTGGACCCC 1860

CTGCGAAGGA CAGTGGAGAA GTACAGCATG GAGCCGGACC CAGCAGGACG GGATATCTCC 1920

ATCCGCCCTC TCCTGGAGCA CTGTGAGAAC ACCCATATGA CCATCTGGCT TGGCATCGTC 1980

TATGCCTACA AGGGACTTCT CATGTTGTTC GGTTGTTTCT TAGCTTGGGA GACCCGCAAC 2040

GTCAGCATCC CCGCACTCAA CGACAGCAAG TACATCGGGA TGAGTGTCTA CAACGTGGGG 2100

ATCATGTGCA TCATCGGGGC CGCTGTCTCC TTCCTGACCC GGGACCAGCC CAATGTGCAG 2160

TTCTGCATCG TGGCTCTGGT CATCATCTTC TGCAGCACCA TCACCCTCTG CCTGGTATTC 2220

GTGCCGAAGC TCATCACCCT GAGAACAAAC CCAGATGCAG CAACGCAGAA CAGGCGATTC 2280

CAGTTCACTC AGAATCAGAA GAAAGAAGAT TCTAAAACGT CCACCTCGGT CACCAGTGTG 2340

AACCAAGCCA GCACATCCCG CCTGGAGGGC CTACAGTCAG AAAACCATCG CCTGCGAATG 2400

AAGATCACAG AGCTGGATAA AGACTTGGAA GAGGTCACCA TGCAGCTGCA GGACACACCA 2460

GAAAAGACCA CCTACATTAA ACAGAACCAC TACCAAGAGC TCAATGACAT CCTCAACCTG 2520

GGAAACTTCA CTGAGAGCAC AGATGGAGGA AAGGCCATTT TAAAAAATCA CCTCGATCAA 2580

AATCCCCAGC TACAGTGGAA CACAACAGAG CCCTCTCGAA CATGCAAAGA TCCTATAGAA 2640

GATATAAACT CTCCAGAACA CATCCAGCGT CGGCTGTCCC TCCAGCTCCC CATCCTCCAC 2700

CACGCCTACC TCCCATCCAT CGGAGGCGTG GACGCCAGCT GTGTCAGCCC CTGCGTCAGC 2760

CCCACCGCCA GCCCCCGCCA CAGACATGTG CCACCCTCCT TCCGAGTCAT GGTCTCGGGC 2820
CTGTAA



Seq ID NO: 71 Protein sequence: 
Protein Accession #: NP_005449



1          11         21         31         41         51

|          |          |          |          |          |

MASPRRSGQP GRPPPPPPPP ARLLLLLLLP LLLPLAPGAW GWARGAPRPP PSSPPLSIMG 60

LMPLTKEVAK GSIGRGVLPA VELAIEQIRN ESLLRPYFLD LRLYDTECDN AKGLKAFYDA 120

IKYGPNHLMV FGGVCPSVTS IIAESLQGWN LVQLSFAATT PVLADKKKYP YFFRTVPSDN 180

AVNPAILKLL KHYQWKRVGT LTQDVQRFSE VRNDLTGVLY GEDIEISDTE SFSNDPCTSV 240

KKLKGNDVRI ILGQFDQNMA AKVFCCAYEE NMYGSKYQWI IPGWYEPSWW EQVHTEANSS 300

RCLRKNLLAA MEGYIGVDFE PLSSKQIKTI SGKTPQQYER EYNNKRSGVG PSKFHGYAYD 360

GIWVIAKTLQ RAMETLHASS RHQRIQDFNY TDHTLGRIIL NAMNETNFFG VTGQVVFRNG 420

ERMGTIKFTQ FQDSREVKVG EYNAVADTLE IINDTIRFQG SEPPKDKTII LEQLRKISLP 480

LYSILSALTI LGMIMASAFL FFNIKNRNQK LIKMSSPYMN NLIILGGMLS YASIFLFGLD 540

GSFVSEKTFE TLCTVRTWIL TVGYTTAFGA MFAKTWRVHA IFKNVKMKKK IIKDQKLLVI 600

VGGMLLIDLC ILICWQAVDP LRRTVEKYSM EPDPAGRDIS IRPLLEHCEN THMTIWLGIV 660

YAYKGLLMLF GCFLAWETRN VSIPALNDSK YIGMSVYNVG IMCIIGAAVS FLTRDQPNVQ 720

FCIVALVIIF CSTITLCLVF VPKLITLRTN PDAATQNRRF QFTQNQKKED SKTSTSVTSV 780

NQASTSRLEG LQSENHRLRM KITELDKDLE EVTMQLQDTP EKTTYIKQNH YQELNDILNL 840

GNFTESTDGG KAILKNHLDQ NPQLQWNTTE PSRTCKDPIE DINSPEHIQR RLSLQLPILH 900
HAYLPSIGGV DASCVSPCVS PTASPRHRHV PPSFRVMVSG L



Seq ID NO: 72 Nucleotide sequence: 
Nucleic Acid Accession #: NM_005795

Coding sequence: 522-1940 (underlined sequences correspond to start and stop codons) 



1          11         21         31         41         51

|          |          |          |          |          |

GCACGAGGGA ACAACCTCTC TCTCTSCAGC AGAGAGTGTC ACCTCCTGCT TTAGGACCAT 60

CAAGCTCTGC TAACTGAATC TCATCCTAAT TGCAGGATCA CATTGCAAAG CTTTCACTCT 120

TTCCCACCTT GCTTGTGGGT AAATCTCTTC TGCGGAATCT CAGAAAGTAA AGTTCCATCC 180

TGAGAATATT TCACAAAGAA TTTCCTTAAG AGCTGGACTG GGTCTTGACC CCTGGAATTT 240

AAGAAATTCT TAAAGACAAT GTCAAATATG ATCCAAGAGA AAATGTGATT TGAGTCTGGA 300

GACAATTGTG CATATCGTCT AATAATAAAA ACCCATACTA GCCTATAGAA AACAATATTT 360

GAATAATAAA AACCCATACT AGCCTATAGA AAACAATATT TGAAAGATTG CTACCACTAA 420

AAAGAAAACT ACTACAACTT GACAAGACTG CTGCAAACTT CAATTGGTCA CCACAACTTG 480

ACAAGGTTGC TATAAAACAA GATTGCTACA ACTTCTAGTT TATGTTATAC AGCATATTTC 540

ATTTGGGCTT AATGATGGAG AAAAAGTGTA CCCTGTATTT TCTGGTTCTC TTGCCTTTTT 600

TTATGATTCT TGTTACAGCA GAATTAGAAG AGAGTCCTGA GGACTCAATT CAGTTGGGAG 660

TTACTAGAAA TAAAATCATG ACAGCTCAAT ATGAATGTTA CCAAAAGATT ATGCAAGACC 720

CCATTCAACA AGCAGAAGGC GTTTACTGCA ACAGAACCTG GGATGGATGG CTCTGCTGGA 780

ACGATGTTGC AGCAGGAACT GAATCAATGC AGCTCTGCCC TGATTACTTT CAGGACTTTG 840

ATCCATCAGA AAAAGTTACA AAGATCTGTG ACCAAGATGG AAACTGGTTT AGACATCCAG 900

CAAGCAACAG AACATGGACA AATTATACCC AGTGTAATGT TAACACCCAC GAGAAAGTGA 960

AGACTGCACT AAATTTGTTT TACCTGACCA TAATTGGACA CGGATTGTCT ATTGCATCAC 1020

TGCTTATCTC GCTTGGCATA TTCTTTTATT TCAAGAGCCT AAGTTGCCAA AGGATTACCT 1080

TACACAAAAA TCTGTTCTTC TCATTTGTTT GTAACTCTGT TGTAACAATC ATTCACCTCA 1140

CTGCAGTGGC CAACAACCAG GCCTTAGTAG CCACAAATCC TGTTAGTTGC AAAGTGTCCC 1200

AGTTCATTCA TCTTTACCTG ATGGGCTGTA ATTACTTTTG GATGCTCTGT GAAGGCATTT 1260

ACCTACACAC ACTCATTGTG GTGGCCGTGT TTGCAGAGAA GCAACATTTA ATGTGGTATT 1320

ATTTTCTTGG CTGGGGATTT CCACTGATTC CTGCTTGTAT ACATGCCATT GCTAGAAGCT 1380

TATATTACAA TGACAATTGC TGGATCAGTT CTGATACCCA TCTCCTCTAC ATTATCCATG 1440

GCCCAATTTG TGCTGCTTTA CTGGTGAATC TTTTTTTCTT GTTAAATATT GTACGCGTTC 1500

TCATCACCAA GTTAAAAGTT ACACACCAAG CGGAATCCAA TCTGTACATG AAAGCTGTGA 1560

GAGCTACTCT TATCTTGGTG CCATTGCTTG GCATTGAATT TGTGCTGATT CCATGGCGAC 1620

CTGAAGGAAA GATTGCAGAG GAGGTATATG ACTACATCAT GCACATCCTT ATGCACTTCC 1680

AGGGTCTTTT GGTCTCTACC ATTTTCTGCT TCTTTAATGG AGAGGTTCAA GCAATTCTGA 1740

GAAGAAACTG GAATCAATAC AAAATCCAAT TTGGAAACAG CTTTTCCAAC TCAGAAGCTC 1800

TTCGTAGTGC GTCTTACACA GTGTCAACAA TCAGTGATGG TCCAGGTTAT AGTCATGACT 1860

GTCCTAGTGA ACACTTAAAT GGAAAAAGCA TCCATGATAT TGAAAATGTT CTCTTAAAAC 1920

CAGAAAATTT ATATAATTGA AAATAGAAGG ATGGTTGTCT CACTGTTTGG TGCTTCTCCT 1980

AACTCAAGGA CTTGGACCCA TGACTCTGTA GCCAGAAGAC TTCAATATTA AATGACTTTG 2040

GGGAATGTCA TAAAGAAGAG CCTTCACATG AAATTAGTAG TGTGTTGATA AGAGTGTAAC 2100

ATCCAGCTCT ATGTGGGAAA AAAGAAATCC TGGTTTGTAA TGTTTGTCAG TAAATACTCC 2160

CACTATGCCT GATGTGACGC TACTAACCTG ACATCACCAA GTGTGGAATT GGAGAAAAGC 2220

ACAATCAACT TTTCTGAGCT GGTGTAAGCC AGTTCCAGCA CACCATTGAT GAATTCAAAC 2280

AAATGGCTGT AAAACTAAAC ATACATGTTG GGCATGATTC TACCCTTATT CSCCCCAAGA 2340

GACCTAGCTA AGGTCTATAA ACATGAAGGG AAAATTAGCT TTTAGTTTTA AAACTCTTTA 2400

TCCCATCTTG ATTGGGGCAG TTGACTTTTT TTTTTTCCCA GAGTGCCGTA GTCCTTTTTG 2460

TAACTACCCT CTCAAATGGA CAATACCAGA AGTGAATTAT CCCTGCTGGC TTTCTTTTCT 2520

CTATGAAAAG CAACTGAGTA CAATTGTTAT GATCTACTCA TTTGCTGACA CATCAGTTAT 2580

ATCTTGTGGC ATATCCATTG TGGAAACTGG ATGAACAGGA TGTATAATAT GCAATCTTAC 2640

TTCTATATCA TTAGGAAAAC ATCTTAGTTG ATGCTACAAA ACACCTTGTC AACCTCTTCC 2700

TGTCTTACCA AACAGTGGGA GGGAATTCCT AGCTGTAAAT ATAAATTTTG TCCCTTCCAT 2760

TTCTACTGTA TAAACAAATT AGCAATCATT TTATATAAAG AAAATCAATG AAGGATTTCT 2820

TATTTTCTTG GAATTTTGTA AAAAGAAATT GTGAAAAATG AGCTTGTAAA TACTCCATTA 2880

TTTTATTTTA TAGTCTCAAA TCAAATACAT ACAACCTATG TAATTTTTAA AGCAAATATA 2940

TAATGCAACA ATGTGTGTAT GTTAATATCT GATACTGTAT CTGGGCTGAT TTTTTAAATA 3000

AAATAGAGTC TGGAATGCTA TATTTGGTAA ATATTTTAAA GACAACCAGA TGCCAGCATC 3060

AGAAGTCTGT TTGAGAACTA AGAGAACAGA AACATCTATC ATAAGATATA TTTATTTTAA 3120

AAACACAAGG TCACTATTTT ACTGAATATA TTTGTTTTGA TAACTCATAC CTTAATAATA 3180

GGTGTGTTTG ACATATTTCT TTTTTCATTT TGACAATGAA CTCACATTCT AATCCAGAAA 3240

TTTTAAACAA CTACTGTGAT AAATACCAAT CTGCTACTTT TATAGATTTT ACCCCATTAA 3300

AATATTACTT TACTGACTTT TACTATGTGA AGATATATAG CTTTGGAAAT GTCCCAGGCT 3360

ATTCAAGAAA TATAAAAAAC TAGAAGGATA CTATATATAC CATATACAAT GCTTTAATAT 3420

TTTAATAGAG CTACTGTATA TAATACAAAT TAGGGAAATA CTTGAATATA TCATTGAGAA 3480

AAAATTATTG TCAGATCTTA CTGAATTATT GTCAGACTTT ATTAAATAAA GATAGAAGAA 3540

AACCTTGCTA ATGAATTAAA GTGAAATTTG CATGGGATTC AGTTTCTCTA ATGTTATTTT 3600

CCGCTGAAAT CTCTAAAGAA CAAGAATGAC TTCAATTAGT AAAAGTCAAT TTTGGGAAAA 3660

GTCATGGGTA TCTGTTTTTT AAGTGTGTCA ATCTGATTAA AATGGATGAA ACAAATTACT 3720

CATCATAAGT TGTTTCTTAA GCTGTCAATA TGTCAATAGA TGGTGAGTTC AGAACTTATT 3780

TCAAATTGCT AAGACAAATT ATCTAAATTC GTAAGAATTA ACATATAGAA TGGTCTGGTC 3840

AGTACATTTA TAATTTATCT ATGCATGAAA AAGTATTGTT TTGTTTGAAA CATGAATTTC 3900
ATAGCAAGCT GCCATAGAAA GGA



Seq ID NO: 73 Protein sequence:
Protein Accession #: NM_005795



1          11         21         31         41         51

|          |          |          |          |          |

MLYSIFHLGL MMEKKCTLYF LVLLPFFMIL VTAELEESPE DSIQLGVTRN KIMTAQYECY 60

QKIMQDPIQQ AEGVYCNRTW DGWLCWNDVA AGTESMQLCP DYFQDFDPSE KVTKICDQDG 120

NWFRHPASNR TWTNYTQCNV NTHEKVKTAL NLFYLTIIGH GLSIASLLIS LGIFFYFKSL 180

SCQRITLHKN LFFSFVCNSV VTIIHLTAVA NNQALVATNP VSCKVSQFIH LYLMGCNYFW 240

MLCEGIYLHT LIVVAVFAEK QHLMWYYFLG WGFPLIPACI HAIARSLYYN DNCWISSDTH 300

LLYIIHGPIC AALLVNLFFL LNIVRVLITK LKVTHQAESN LYMKAVRATL ILVPLLGIEF 360

VLIPWRPEGK IAEEVYDYIM HILMHFQGLL VSTIFCFFNG EVQAILRRNW NQYKIQFGNS 420
FSNSEALRSA SYTVSTISDG PGYSHDCPSE HLNGKSIHDI ENVLLKPENL YN 



Seq ID NO: 74 Nucleotide sequence: 
Nucleic Acid Accession #: NM_000450.1 

Coding sequence: 117..1949 (underlined sequences correspond to start and stop codons)



1          11         21         31         41         51

|          |          |          |          |          |

CCTGAGACAG AGGCAGCAGT GATACCCACC TGAGAGATCC TGTGTTTGAA CAACTGCTTC 60 

CCAAAACGGA AAGTATTTCA AGCCTAAACC TTTGGGTGAA AAGAACTCTT GAAGTCATGA 120 

TTGCTTCACA GTTTCTCTCA GCTCTCACTT TGGTGCTTCT CATTAAAGAG AGTGGAGCCT 180 

GGTCTTACAA CACCTCCACG GAAGCTATGA CTTATGATGA GGCCAGTGCT TATTGTCAGC 240 

AAAGGTACAC ACACCTGGTT GCAATTCAAA ACAAAGAAGA GATTGAGTAC CTAAACTCCA 300 

TATTGAGCTA TTCACCAAGT TATTACTGGA TTGGAATCAG AAAAGTCAAC AATGTGTGGG 360 

TCTGGGTAGG AACCCAGAAA CCTCTGACAG AAGAAGCCAA GAACTGGGCT CCAGGTGAAC 420 

CCAACAATAG GCAAAAAGAT GAGGACTGCG TGGAGATCTA CATCAAGAGA GAAAAAGATG 480 

TGGGCATGTG GAATGATGAG AGGTGCAGCA AGAAGAAGCT TGCCCTATGC TACACAGCTG 540 

CCTGTACCAA TACATCCTGC AGTGGCCACG GTGAATGTGT AGAGACCATC AATAATTACA 600 



CTTGCAAGTG TGACCCTGGC TTCAGTGGAC TCAAGTGTGA GCAAATTGTG AACTGTACAG 660 

CCCTGGAATC CCCTGAGCAT GGAAGCCTGG TTTGCAGTCA CCCACTGGGA AACTTCAGCT 720 

ACAATTCTTC CTGCTCTATC AGCTGTGATA GGGGTTACCT GCCAAGCAGC ATGGAGACCA 780 

TGCAGTGTAT GTCCTCTGGA GAATGGAGTG CTCCTATTCC AGCCTGCAAT GTGGTTGAGT 840 

GTGATGCTGT GACAAATCCA GCCAATGGGT TCGTGGAATG TTTCCAAAAC CCTGGAAGCT 900 

TCCCATGGAA CACAACCTGT ACATTTGACT GTGAAGAAGG ATTTGAACTA ATGGGAGCCC 960 

AGAGCCTTCA GTGTACCTCA TCTGGGAATT GGGACAACGA GAAGCCAACG TGTAAAGCTG 1020 

TGACATGCAG GGCCGTCCGC CAGCCTCAGA ATGGCTCTGT GAGGTGCAGC CATTCCCCTG 1080 

CTGGAGAGTT CACCTTCAAA TCATCCTGCA ACTTCACCTG TGAGGAAGGC TTCATGTTGC 1140 

AGGGACCAGC CCAGGTTGAA TGCACCACTC AAGGGCAGTG GACACAGCAA ATCCCAGTTT 1200 

GTGAAGCTTT CCAGTGCACA GCCTTGTCCA ACCCCGAGCG AGGCTACATG AATTGTCTTC 1260 

CTAGTGCTTC TGGCAGTTTC CGTTATGGGT CCAGCTGTGA GTTCTCCTGT GAGCAGGGTT 1320 

TTGTGTTGAA GGGATCCAAA AGGCTCCAAT GTGGCCCCAC AGGGGAGTGG GACAACGAGA 1380 

AGCCCACATG TGAAGCTGTG AGATGCGATG CTGTCCACCA GCCCCCGAAG GGTTTGGTGA 1440 

GGTGTGCTCA TTCCCCTATT GGAGAATTCA CCTACAAGTC CTCTTGTGCC TTCAGCTGTG 1500 

AGGAGGGATT TGAATTATAT GGATCAACTC AACTTGAGTG CACATCTCAG GGACAATGGA 1560 

CAGAAGAGGT TCCTTCCTGC CAAGTGGTAA AATGTTCAAG CCTGGCAGTT CCGGGAAAGA 1620 

TCAACATGAG CTGCAGTGGG GAGCCCGTGT TTGGCACTGT GTGCAAGTTC GCCTGTCCTG 1680 

AAGGATGGAC GCTCAATGGC TCTGCAGCTC GGACATGTGG AGCCACAGGA CACTGGTCTG 1740 

GCCTGCTACC TACCTGTGAA GCTCCCACTG AGTCCAACAT TCCCTTGGTA GCTGGACTTT 1800 

CTGCTGCTGG ACTCTCCCTC CTGACATTAG CACCATTTCT CCTCTGGCTT CGGAAATGCT 1860 

TACGGAAAGC AAAGAAATTT GTTCCTGCCA GCAGCTGCCA AAGCCTTGAA TCAGACGGAA 1920 

GCTACCAAAA GCCTTCTTAC ATCCTTTAAG TTCAAAAGAA TCAGAAACAG GTGCATCTGG 1980

GGAACTAGAG GGATACACTG AAGTTAACAG AGACAGATAA CTCTCCTCGG GTCTCTGGCC 2040 

CTTCTTGCCT ACTATGCCAG ATGCCTTTAT GGCTGAAACC GCAACACCCA TCACCACTTC 2100 

AATAGATCAA AGTCCAGCAG GCAAGGACGG CCTTCAACTG AAAAGACTCA GTGTTCCCTT 2160 

TCCTACTCTC AGGATCAAGA AAGTGTTGGC TAATGAAGGG AAAGGATATT TTCTTCCAAG 2220 

CAAAGGTGAA GAGACCAAGA CTCTGAAATC TCAGAATTCC TTTTCTAACT CTCCCTTGCT 2280 

CGCTGTAAAA TCTTGGCACA GAAACACAAT ATTTTGTGGC TTTCTTTCTT TTGCCCTTCA 2340

CAGTGTTTCG ACAGCTGATT ACACAGTTGC TGTCATAAGA ATGAATAATA ATTATCCAGA 2400 

GTTTAGAGGA AAAAAATGAC TAAAAATATT ATAACTTAAA AAAATGACAG ATGTTGAATG 2460 

CCCACAGGCA AATGCATGGA GGGTTGTTAA TGGTGCAAAT CCTACTGAAT GCTCTGTGCG 2520 

AGGGTTACTA TGCACAATTT AATCACTTTC ATCCCTATGG GATTCAGTGC TTCTTAAAGA 2580 

GTTCTTAAGG ATTGTGATAT TTTTACTTGC ATTGAATATA TTATAATCTT CCATACTTCT 2640 

TCATTCAATA CAAGTGTGGT AGGGACTTAA AAAACTTGTA AATGCTGTCA ACTATGATAT 2700 

GGTAAAAGTT ACTTATTCTA GATTACCCCC TCATTGTTTA TTAACAAATT ATGTTACATC 2760 

TGTTTTAAAT TTATTTCAAA AAGGGAAACT ATTGTCCCCT AGCAAGGCAT GATGTTAACC 2820 

AGAATAAAGT TCTGAGTGTT TTTACTACAG TTGTTTTTTG AAAACATGGT AGAATTGGAG 2880 

AGTAAAAACT GAATGGAAGG TTTGTATATT GTCAGATATT TTTTCAGAAA TATGTGGTTT 2940 

CCACGATGAA AAACTTCCAT GAGGCCAAAC GTTTTGAACT AATAAAAGCA TAAATGCAAA 3000 

CACACAAAGG TATAATTTTA TGAATGTCTT TGTTGGAAAA GAATACAGAA AGATGGATGT 3060 

GCTTTGCATT CCTACAAAGA TGTTTGTCAG ATGTGATATG TAAACATAAT TCTTGTATAT 3120 

TATGGAAGAT TTTAAATTCA CAATAGAAAC TCACCATGTA AAAGAGTCAT CTGGTAGATT 3180 

TTTAACGAAT GAAGATGTCT AATAGTTATT CCCTATTTGT TTTCTTCTGT ATGTTAGGGT 3240

GCTCTGGAAG AGAGGAATGC CTGTGTGAGC AAGCATTTAT GTTTATTTAT AAGCAGATTT 3300 

AACAATTCCA AAGGAATCTC CAGTTTTCAG TTGATCACTG GCAATGAAAA ATTCTCAGTC 3360 

AGTAATTGCC AAAGCTGCTC TAGCCTTGAG GAGTGTGAGA ATCAAAACTC TCCTACACTT 3420 

CCATTAACTT AGCATGTGTT GAAAAAAAAA GTTTCAGAGA AGTTCTGGCT GAACACTGGC 3480 

AACGACAAAG CCAACAGTCA AAACAGAGAT GTGATAAGGA TCAGAACAGC AGAGGTTCTT 3540 

TTAAAGGGGC AGAAAAACTC TGGGAAATAA GAGAGAACAA CTACTGTGAT CAGGCTATGT 3600 

ATGGAATACA GTGTTATTTT CTTTGAAATT GTTTAAGTGT TGTAAATATT TATGTAAACT 3660 

GCATTAGAAA TTAGCTGTGT GAAATACCAG TGTGGTTTGT GTTTGAGTTT TATTGAGAAT 3720 

TTTAAATTAT AACTTAAAAT ATTTTATAAT TTTTAAAGTA TATATTTATT TAAGCTTATG 3780 
TCAGACCTAT TTGACATAAC ACTATAAAGG TTGACAATAA ATGTGCTTAT GTTT 



Seq ID NO: 75 Protein sequence: 
Protein Accession #: NP_000441 



1          11         21         31         41         51

|          |          |          |          |          |

MIASQFLSAL TLVLLIKESG AWSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60

SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK 120

DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCEQIVNC 180

TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV 240

ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK 300

AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP 360

VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420

EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ 480

WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540 

SGLLPTCEAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600 
GSYQKPSYIL 



Seq ID NO: 76 Nucleotide sequence:
Nucleic Acid Accession #: NM_031439

Coding sequence: 69..1235 (underlined sequences correspond to start and stop codons)

CCCGACCCGT 
GTGCGGCCAX 
CCCTGGACGC 
ACAAGGGCTC 
ACGAGAGGAA 
TGCTGGGAAA 
CGGAGCGGCT 
GGAAGAAGCA 
TCTCCCGGGA 
AGAAGGAGGA 
ACCACGAGGG 
CGTACGGGCT 
CCTTCTTCTC 
CAGGGCACCC 
GCTCCCTGGC 
CCCCATCTCC 
CCCACCTGGG 
TGAGCCAGGT 
CTCCTGGCCA 
CCCAGGTGAC 
CGGCCACGTA 
CCCTCGCGCC 
TTTCCTCCCA 
GCTGGACTCT 
GCCCACATTT 
CTCCCAGTGG 
GACAGACTTG 
ATTAATAAAG 
TGATAATTTT 
TCCAAAGTGA 
GATTTGAGAA 
TATTTTATTT 
ACACACACTT 
TATAAAATTC 
GATGTTTAAA 



11 
I 

GCGAGGGCCA 
GGCTTCGCTG 
CGAGCTGTCG 
CGAGAGCCGT 
ACGGCTGGCA 
GTCGTGGAAG 
GCGCCTGCAG 
GGCCAAGCGG 
CCAGAACGCC 
CAGGGGTGAG 
GCCGGCTGGT 
GCCCACACCT 
CTCCCCCTGC 
GTACTCACCG 
CCTTGGCCAG 
TGCCTATTAC 
CCAGCTTTCC 
GGAACTCCTG 
CCCAGACTCC 
ACCAACGGGT 
CTACAACAGC 
CTCTCCTTCT 
CCGCTCAGGG 
CCTTATCCGA 
TAAGTATATT 
AATGTTCACT 
ATAGCCAAGG 
GAAGATGGGG 
GTGTGCACAG 
CCACAAAATT 
ATTAACCAGT 
TAAATATACA 
CAAGAGCCAC 
AGTGTATTAG 
AACAAAACAG 



21 
I 

GGTCCGCGCC 
CTGGGAGCCT 
GATGGACAAT 
ATCCGGCGGC 
GTGCAGAACC 
GCGCTGACGC 
CACATGCAGG 
CTGTGCAAGC 
CTGCCGGAGA 
TACTCCCCCG 
GGTGGCGGCG 
CCTGAAATGT 
CAGGAGGAGC 
GAGTACGCCC 
TCCCCCGGCG 
TCCCCGGCCA 
CCGCCTCCTG 
GGGGACATGG 
GCCACAGGGG 
CCCACAGAGA 
TACAGTGTGT 
TGTGCCTTGA 
CAGGGAGGTC 
GTGCCGCCTC 
CCTTCAAGTG 
GACGrCTTTT 
TCCCTTCTGG 
AAATTTGACX 
CCCAAGGACC 
TCAAAGGGAC 
ATGGCTAACT 
TTTTAAAGCA 
CGCGCCCAGC 
TTTCATTACA 
GCTGTTGTAA 



31 
I 

TGCCCCGCCA 
ACCCTTGGCC 
CGCCGCCGGC 
CCATGAACGC 
CGGACCTGCA 
TGTCCCAGAA 
ACTACCCCAA 
GCGTGGACCC 
AGAGAAGCGG 
GCACTGCCCT 
GCGGCACCCC 
CTCCCCTGGA 
ATGGCCATCC 
CAAGCCCTCT 
TCTCCATGAT 
CCTACCACCC 
AGCACCCTGG 
ATCGCAATGA 
CCATGGCCCT 
CCAGCCTCAT 
C ATAGA GCTG 
GTGGCAGAGG 
TGAACTGCGG 
TATCCCCTTC 
AGTTTTCCTC 
CTTGGTAGCC 
TCCAGTTTTC 
CATTAATGAG 
ACGAGGCTTT 
TCATACAATT 
ATATCACAGA 
GTTCTTTTTT 
CTACATTTAT 
TAGGAGAAAT 
AAAAAAAAAA 



41 

I 

GGCGAAGCGA 
CGAGGGTCTC 
CGTCCCCCGG 
CTTCATGGTT 
CAACGCCGAG 
GAGGCOGTAC 
CTACAAGTAC 
GGGCTTCCTT 
CAGCCGGGGG 
GCCCAGCCTC 
GAGCAGTGTG 
CGTGCTGGAG 
CCGCCGCATC 
CCACTGTAGC 
GTCCCCTGTA 
ACTCCACTCC 
CTTCGACGCC 
ATTCGACCAG 
CAGTGGGCAT 
CTCCGTCCTG 
GAGGCGCCCC 
AGCCGTCCAG 
CCCCAGAGCC 
CCCACGTTCC 
CAGCCCCTGA 
ATCATCGAAA 
TGATTTAGGG 
CTCGCTAACC 
CTGCACTTTC 
TGAGAAAAAA 
AAATGGGATT 
TTTGTTAATT 
AATTTTCATT 
TATATTTCTA 
AAAAAAAAA 



51 
I 

GGCGACCCGC 
GAGTGCCCGG 
CCCCCGGGGG 
TGGGCCAAGG 
CTCAGCAAGA 
GTGGACGAGG 
CGGCCGCGCA 
CTGAGCTCCC 
GCGCTGGGGG 
CGGGGCTGCT 
GACACGTACC 
CCGGAGCAGA 
CCCCACCTGC 
CACCCCCTGG 
CCCGGCTGTC 
AACCTCCAAG 
CTGGATCAAC 
TATTTGAACA 
GTTCCGGTCT 
GCTGATGCCA 
GTCCGGTCAG 
CCACACCAGC 
TTTGGCCTAA 
AGCCCCTGCA 
GAGTTGCTGT 
CTAATGGGGG 
TTCTCTCAAG 
TACGATCTGG 
TGCACCCCCT 
CAGTCAACCT 
GAGTTAAAAC 
TGTTTATTAT 
CTCTTTTACC 
AACATTTTAT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1060 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620. 
1680 
1740 
1800 
1860 
1920 
1980 
2040 



45 



Seq ID NO: 77 Protein sequence:
Protein Accession #: NP_113627

1 11 21 31 41 51
| | | | | |
MASLLGAYPW PEGLECPALD AELSDGQSPP AVPRPPGDKG SESRIRRPMN AFMVWAKDER 60
KRLAVQNPDL HNAELSKMLG KSWKALTLSQ KRPYVDEAER LRLQHMQDYP NYKYRPRRKK 120
QAKRLCKRVD PGFLLSSLSR DQNALPEKRS GSRGALGEKE DRGEYSPGTA LPSLRGCYHE 180
GPAGGGGGGT PSSVDTYPYG LPTPPEMSPL DVLEPEQTFF SSPCQEEHGH PRRIPHLPGH 240
PYSPEYAPSP LHCSHPLGSL ALGQSPGVSM MSPVPGCPPS PAYYSPATYH PLHSNLQAHL 300
GQLSPPPEHP GFDALDQLSQ VELLGDMDRN EFDQYLNTPG HPDSATGAMA LSGHVPVSQV 360
TPTGPTETSL ISVLADATAT YYNSYSVS


Seq ID NO: 78 Nucleotide sequence: 
Nucleic Acid Accession #: XM_035787 

Coding sequence: 329..949 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
TGCCCCGCCC CGCTCCCCAG OGCCCCGGAA GTGATCTGTG GCGGCPGCTG CAGAGCCGCC 60
AGGAGGAGGG TGGATCTCCC CAGAGCAAAG CGTCGGAGTC CTCCTCCTCC TTCTCCTCCT 120
CCTCCTCCTC CTCCTCCAGC CGCCCAGGCT CCCCCGCCAC CCGTCAGACT CCTCCTTCGA 180
CCGCTCCCGG OGCGGGGCCT TCCAGGCGAC AAGGACCGAG TACCCTCCGG CCGGAGCCAC 240
GCAGCCGCGG CTTCCGGAGC CCTCGGGGCG GCGGACTGGC TOGOGGTGCA GATTCTTCTT 300
AATCCTTTGG TGAAAACTGA GACACAAAAT GGCTGCAAAT AAGCCCAAGG GTCAGAATTC 360
TTTGGCTTTA CACAAAGTCA TCATGGTGGG CAGTGGTGGC GTGGGCAAGT CAGCTCTGAC 420
TCTACAGTTC ATGTACGATG AGTTTGTGGA GGACTATGAG CCTACCAAAG CAGACAGCTA 480
TCGGAAGAAG GTAGTGCTAG ATGGGGAGGA AGTCCAGATC GATATCTTAG ATACAGCTGG 540
GCAGGAGGAC TACGCTGCAA TTAGAGACAA CTACTTCCGA AGTGGGGAGG GGTTCCTCTG 600
TGTTTTCTCT ATTACAGAAA TGGAATCCTT TGCAGCTACA GCTGACTTCA GGGAGCAGAT 660
TTTAAGAGTA AAAGAAGATG AGAATGTTCC ATTTCTACTG GTTGGTAACA AATCAGATTT 720
AGAAGATAAA AGACAGGTTT CTGTAGAAGA GGCAAAAAAC AGAGCTGAGC AGTGGAATGT 780
TAACTACGTG GAAACATCTG CTAAAACACG AGCTAATGTT GACAAGGTAT TTTTTGATTT 840
AATGAGAGAA ATTCGAGCGA GAAAGATGGA AGACAGCAAA GAAAAGAATG GAAAAAAGAA 900
GAGGAAAAGT TTAGCCAAGA GAATCAGAGA AAGATGCTGC ATTTTATAAT CAAAGCCCAA 960
ACTCCTTTCT TATCTTGACC ATACTAATAA ATATAATTTA TAAGCATTGC CATTGAAGGC 1020
TTAATTGACT GAAATTACTT TAACATTTTG GAAATTGTTG TATATCACTA AAAGCATGAA 1080
TTGGAACTGC AATGAAA6TC AAATTTACTT TAAAAAGAAA TTAATATGGC TTCACCAAGA 1140
AGCAAAGTTC AACTTATTTC ATAATTGCCT ACATTTATCA TGGTCCTGAA TGTAGCGTGT 1200
AAGCTIGTGT TTCTTGGGCA GTCTTTCTTG AAATTGAAGA GGTGAAATGG GGGTGGGGAG 1260
TGGGAGGAAA GGTGACTTCC TCTGGTGTTT ATTATAAAGC TTAAATTTTA TATCATTTTA 1320
AAATGTCTTG GTCTTCTACT GOCTTGAAAA ATGACAATTG TGAACATGAT AGTTAAACTA 1380
CCACTTTTTT TAACCATTAT TATGCAAAAT TTAGAAGAAA AGTTATTGGC ATGGTTGTTG 1440
CATATAGTTA AACTGAGAGT AATTCATCTG TGAATCTGCT TTAATTACCT GGTGAGTAAC 1500
TTAGAAAAGT GGTGTAAACT TGTACATGGA ATTTTTTGAA TATGCCTTAA TTTAGAAACT 1560
GAAAAATATC TGGTTATATC ATTCTGGGTG TGTTCTTACT GACACCAGGG GTCCGCTGCC 1620
CCATGTGTCC TGGTGAGAAA ATATATGCCT GGCACAGCTT TTGTATAGAA AATTCTTGAG 1680
AAGTAACTGT CCGCTAGAAG TCTGTCCAAA TTTAAAATGT GTGCCATATT CTGGTTCTTG 1740
AAAATAAGAT TCCAGAGCTC TTTGATCGCT TTTAATAAAC TGCAAGTTCA TTTTAAATGA 1800
AGGGCCAGCA TATATACTTG CAAGATAATT TTCAGCTGCA AGGATTCAGC ACCAGTTATG 1860
TTTGAATGAA CCCTCCTTTT CTCTGAGATT CTGGTCCCTG GAAATCCCTT TCTGCTAGTG 1920
GTGAGCATGT AAGTGTTAAG TTTTTAATCT GGGAGCAGGG CATAGGAAGA AAATGTCAGT 1980
AGTGCTAATG CATTTTGCAC TAGAAOGCTT CGGGAAAATA TTCATGCTTG CCATCTGTTC 2040
ATTTCTAAAT TTATATTCAT AAAGTTACAG TTTGATACAG GAATTATTAG GAGTAATTCT 2100
TTTCTGTTTC TGTTTATAAT GAAGAACACT GTAGCTACAT TTTCAGAAGT TAACATCAAG 2160
CCATCAAACC TGGGTATAGT GCAGAAAACG TGGCACACAC TGACCACACA TTAGGCTGTG 2220
TCACCATTGT GTGGTGTACC TGCTGGAAGA ATTCTAGCAT GCTACTTGGG GACATAATTT 2280
CAGTGGGAAA TATGCCACTG ACCGATTTTT TTTTTTTCCT CTTTGCAGTG GGGCTAGGAC 2340
AGTTGATTCA ACAAAGTATT TTTTTCTTTT TTCTCAGTCC TAATTTGAAC AGGTCAAAGA 2400
TGTGTTCAGG CATTCCAGGT AACAGGTGTG TATGTAAAGT TAAAAATAGG CTTTTTAGGA 2460
ACTCACTCTT TAGATATTTA CATCCAGCTT CTCATGTTAA ATATTTGTCC TTAAAGGGTT 2520
TGAGATGTAC ATCTTTCATT TCGTATTTCT CATAGGCTAT GCCATGTGCG GAATTCAAGT 2580
TACCAATGTA ACACTGGCCA GOGGGCCCAG CAATCTCCAT GTGTACTTAT TACAGTCTTA 2640
TTTAACCAGG GGTCCTAACC ACTAACATTG TGACTTTGCT TTGAGACCTT TCCTCTCCTG 2700
GGTACTGAGG TGCTATGAAG CCAACTGACA AAGATGCATC ACGTGTCTTA GGCTGATGCC 2760
ACTACCCGAT TTGTTTATTT GCAATTTGAG CCATTTAAAG ACCAATAAAC TTCCTTTTTT



Seq ID NO: 79 Protein sequence: 
Protein Accession #: XP_035787 



1 11 21 31 41 51 

| | | | | |

MAANKPKGQN SLALHKVIMV GSGGVGKSAL TLQFMYDEFV EDYEPTKADS YRKKVVLDGE 60
EVQIDILDTA GQEDYAAIRD NYFRSGEGFL CVFSITEMES FAATADFREQ ILRVKEDENV 120
PFLLVGNKSD LEDKRQVSVE EAKNRAEQWN VNYVETSAKT RANVDKVFFD LMREIRARKM 180
EDSKEKNGKK KRKSLAKRIR ERCCIL



Seq ID NO: 80 Nucleotide sequence: 
Nucleic Acid Accession #: NM_003467 

Coding sequence: 89..1147 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

GTTTGTTGGC TGCGGCAGCA GGTAGCAAAG TGACGCCGAG GGCCTGAGTG CTCCAGTAGC 60

CACCGCATCT GGAGAACCAG CGGTTACCAT GGAGGGGATC AGTATATACA CTTCAGATAA 120

CTACACCGAG GAAATGGGCT CAGGGGACTA TGACTCCATG AAGGAACCCT GTTTCCGTGA 180 

AGAAAATGCT AATTTCAATA AAATCTTCCT GCCCACCATC TACTCCATCA TCTTCTTAAC 240 

TGGCATTGTG GGCAATGGAT TGGTCATCCT GGTCATGGGT TACCAGAAGA AACTGAGAAG 300 

CATGACGGAC AAGTACAGGC TGCACCTGTC AGTGGCCGAC CTCCTCTTTG TCATCACGCT 360 

TCCCTTCTGG GCAGTTGATG CCGTGGCAAA CTGGTACTTT GGGAACTTCC TATGCAAGGC 420 

AGTCCATGTC ATCTACACAG TCAACCTCTA CAGCAGTGTC CTCATCCTGG CCTTCATCAG 480 

TCTGGACCGC TACCTGGCCA TCGTCCACGC CACCAACAGT CAGAGGCCAA GGAAGCTGTT 540 

GGCTGAAAAG GTGGTCTATG TTGGCGTCTG GATCCCTGCC CTCCTGCTGA CTATTCCCGA 600 

CTTCATCTTT GCCAACGTCA GTGAGGCAGA TGACAGATAT ATCTGTGACC GCTTCTACCC 660 

CAATGACTTG TGGGTGGTTG TGTTCCAGTT TCAGCACATC ATGGTTGGCC TTATCCTGCC 720 

TGGTATTGTC ATCCTGTCCT GCTATTGCAT TATCATCTCC AAGCTGTCAC ACTCCAAGGG 780 

CCACCAGAAG CGCAAGGCCC TCAAGACCAC AGTCATCCTC ATCCTGGCTT TCTTCGCCTG 840 

TTGGCTGCCT TACTACATTG GGATCAGCAT CGACTCCTTC ATCCTCCTGG AAATCATCAA 900 

GCAAGGGTGT GAGTTTGAGA ACACTGTGCA CAAGTGGATT TCCATCACCG AGGCCCTAGC 960 

TTTCTTCCAC TGTTGTCTGA ACCCCATCCT CTATGCTTTC CTTGGAGCCA AATTTAAAAC 1020 

CTCTGCCCAG CACGCACTCA CCTCTGTGAG CAGAGGGTCC AGCCTCAAGA TCCTCTCCAA 1080 

AGGAAAGCGA GGTGGACATT CATCTGTTTC CACTGAGTCT GAGTCTTCAA GTTTTCACTC 1140 

CAGCTAACAC AGATGTAAAA GACTTTTTTT TATACGATAA ATAACTTTTT TTTAAGTTAC 1200

ACATTTTTCA GATATAAAAG ACTGACCAAT ATTGTACAGT TTTTATTGCT TGTTGGATTT 1260 

TTGTCTTGTG TTTCTTTAGT TTTTGTGAAG TTTAATTGAC TTATTTATAT AAATTTTTTT 1320 

TGTTTCATAT TGATGTGTGT CTAGGCAGGA CCTGTGGCCA AGTTCTTAGT TGCTGTATGT 1380 

CTCGTGGTAG GACTGTAGAA AAGGGAACTG AACATTCCAG AGCGTGTAGT GAATCACGTA 1440 

AAGCTAGAAA TGATCCCCAG CTGTTTATGC ATAGATAATC TCTCCATTCC CGTGGAACGT 1500
TTTTCCTGTT CTTAAGACGT GATTTTGCTG TAGAAGATGG CACTTATAAC CAAAGCCCAA 1560
AGTGGTATAG AAATGCTGGT TTTTCAGTTT TCAGGAGTGG GTTGATTTCA GCACCTACAG 1620 
TGTACAGTCT TGTATTAAGT TGTTAATAAA AGTACATGTT AAACTTACTT AGTGTTATG 

Seq ID NO: 81 Protein sequence:
Protein Accession #: NP_003458



1 11 21 31 41 51

| | | | | |

MEGISIYTSD NYTEEMGSGD YDSMKEPCFR EENANFNKIF LPTIYSIIFL TGIVGNGLVI 60
LVMGYQKKLR SMTDKYRLHL SVADLLFVIT LPFWAVDAVA NWYFGNFLCK AVHVIYTVNL 120
YSSVLILAFI SLDRYLAIVH ATNSQRPRKL LAEKVVYVGV WIPALLLTIP DFIFANVSEA 180
DDRYICDRFY PNDLWVVVFQ FQHIMVGLIL PGIVILSCYC IIISKLSHSK GHQKRKALKT 240
TVILILAFFA CWLPYYIGIS IDSFILLEII KQGCEFENTV HKWISITEAL AFFHCCLNPI 300
LYAFLGAKFK TSAQHALTSV SRGSSLKILS KGKRGGHSSV STESESSSFH SS



Seq ID NO: 82 Nucleotide sequence: 
Nucleic Acid Accession #: NM_014959 

Coding sequence: 314..1609 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
CTGGTTCTCA ACTTCTTTTG AAATAATGTT CATAGAGAAG GAGGGCTGTC TGAGATTCGA 60
GGGAAACAAG CTCTCAGGAC TTCCGGTCGC CATGATGGCT GTGGGCGGTA AACGCGGTTA 120
GTGCAAGCAT CTGGGCCATC TTCAATGGTA AAAAAGATAC AGTAAAGACA TAAATACCAC 180
ATTTGACAAA TGGAAAAAAA GGAGTGTCCA GAAAAGAGTA GCAGCAGTGA GGAAGAGCTG 240
CCGAGACGGG TATACAGGGA GCTACCCTGT GTTTCTGAGA CCCTTTGTGA CATCTCACAT 300
TTTTTCCAAG AAGATGATGA GACAGAGGCA GAGCCATTAT TGTTCCGTGC TGTTCCTGAG 360
TGTCAACTAT CTGGGGGGGA CATTCCCAGG AGACATTTGC TCAGAAGAGA ATCAAATAGT 420
TTCCTCTTAT GCTTCTAAAG TCTGTTTTGA GATOGAAGAA GATTATAAAA ATCGTCAGTT 480
TCTGGGGCCT GAAGGAAATG TGGATGTTGA GTTGATTGAT AAGAGCACAA ACAGATACAG 540
CGTTTGGTTC CCCACTGCTG GCTGGTATCT GTGGTCAGCC ACAGGCCTCG GCTTCCTGGT 600
AAGGGATGAG GTCACAGTGA CGATTGCGTT TGGTTCCTGG AGTCAGCACC TGGCCCTGGA 660
CCTGCAGCAC CATGAACAGT GGCTGGTGGG CGGCCCCTTG TTTGATGTCA CTGCAGAGCC 720
AGAGGAGGCT GTCGCCGAAA TCCACCTCCC CCACTTCATC TCCCTCCAAG GTGAGGTGGA 780
CGTCTCCTGG TTTCTCGTTG CCCATTTTAA GAATGAAGGG ATGGTCCTGG AGCATCCAGC 840
CCGGGTGGAG CCTTTCTATG CTGTCCTGGA AAGCCCCAGC TTCTCTCTGA TGGGCATCCT 900
GCTGCGGATC GCCAGTGGGA CTCGCCTCTC CATCCCCATC ACTTCCAACA CATTGATCTA 960
TTATCACCCC CACCCCGAAG ATATTAAGTT CCACTTGTAC CTTGTCCCCA GCGACGCCTT 1020
GCTAACAAAG GCGATAGATG ATGAGGAAGA TCGCTTCCAT GGTGTGCGCC TGCAGACTTC 1080
GCCCCCAATG GAACCCCTGA ACTTTGGTTC CAGTTATATT GTGTCTAATT CTGCTAACCT 1140
GAAAGTAATG CCCAAGGAGT TGAAATTGTC CTACAGGAGC CCTGGAGAAA TTCAGCACTT 1200
CTCAAAATTC TATGCTGGGC AGATGAAGGA ACCCATTCAA CTTGAGATTA CTGAAAAAAG 1260
ACATGGGACT TTGGTGTGGG ATACTGAGGT GAAGCCAGTG GATCTCCAGC TTGTAGCTGC 1320
ATCAGCCCCT CCTCCTTTCT CAGGTGCAGC CTTTGTGAAG GAGAACCACC GGCAACTCCA 1380
AGCCAGGATG GGGGACCTGA AAGGGGTGCT CGATGATCTC CAGGACAATG AGGTTCTTAC 1440
TGAGAATGAG AAGGAGCTGG TGGAGCAGGA AAAGACACGG CAGAGCAAGA ATGAGGCCTT 1500
GCTGAGCATG GTGGAGAAGA AAGGGGACCT GGCCCTGGAC GTGCTCTTCA GAAGCATTAG 1560
TGAAAGGGAC CCTTACCTCG TGTCCTATCT TAGACAGCAG AATTTGTAAA ATGAGTCAGT 1620
TAGGTAGTCT GGAAGAGAGA ATCCAGOGTT CTCATTGGAA ATGGATAAAC AGAAATGTGA 1680
TCATTGATTT CAGTGTTCAA GACAGAAGAA GACTGGGTAA CATCTATCAC ACAGGCTTTC 1740
AGGACAGACT TGTAACCTGG CATGTACCTA TTGACTGTAT CCTCATGCAT TTTCCTCAAG 1800
AATGTCTGAA GAAGGTAGTA ATATTCCTTT TAAATTTTTT CCAACCATTG CTTGATATAT 1860
CACTATTTTA TCCATTGACA TGATTCTTGA AGACCCAGGA TAAAGGACAT CCGGATAGGT 1920
GTGTTTATGA AGGATGGGGC CTGGAAAGGC AACTTTTCCT GATTAATGTG AAAAATAATT 1980
CCTATGGACA CTCCGTTTGA AGTATCACCT TCTCATAACT AAAAGCAGAA AAGCTAACAA 2040
AAGCTTCTCA GCTGAGGACA CTCAAGGCAT ACATGATGAC AGTCTTTTTT TTTTTTGTAT 2100
GTTAGGACTT TAACACTTTA TCTATGGCTA CTGTTATTAG AACAATGTAA ATGTATTTGC 2160
TGAAAGAGAG CACAAAAATG GGAGAAAATG CAAACATGAG CAGAAAATAT TTTCCCACTG 2220
GTGTGTAGCC TGCTACAAGG AGTTGTTGGG TTAAATGTTC ATGGTCAACT CCAAGGAATA 2280
CTGAGATGAA ATGTGGTAAA TCAACTCCAC AGAACCACCA AAAAGAAAAT GAGGGTAATT 2340
CAGCTTATTC TGAGACAGAC ATTCCTGGCA ATGTACCATA CAAAAAATAA GCCAACTCTG 2400
ACATTTGGAT TCTACCATAG ACTCTGTCAT TTTGTAGCCA TTTCAGCTGT CTTTTGATTA 2460
ATGTTTTCGT GGCACACATA TTTCCATCCT TTTATGTTTA ATCTGTTTAA AACAAGTTCC 2520
TAGTAGACAC CATCTGGTTG AGTCAGTTTT TTTTATGGTG TATTTTGAAC CCATTCTGAT 2580
AGTCTCTTTT AACTGGAAGA TTTCAATTAC TTAOGTTAAT GTAATTATTA ATATGTTAGG 2640
ATTTATCCTC AGTCAGCCAG TTTGTTATGT CTTTTCTATT CTACTGTTAT CACATTTGTA 2700
CCACTTAAAG TGGAATCTAG GCACTTTATC ACCATTTAGA TCCTATTACC TTTTCTCATC 2760
TAGGATATAG TTATCTTCTA CATAATCTTT CTGTATCTTA AAACCCATCA ATAAATTATT 2820
ATATATTTTC TACTTTTAAT CACTCAGAAG ATTTAAAAAA CTCATGAGAA GAGTAATCTG 2880
TTATGTTTTT CCAGATATTT ACCATTTCTG TTGCTCTTCC TTCATTATTT TCCAAATTTC 2940
GTTCTGCAAA TTTCCACTTC TTCTGATAGA CGTTTTTTAG TTCTTTTAGA GTGGTTCTGA 3000
TAGGTACAGA TTCTCTTATT TTTTGCTTCC TCTGAGGACA TCTTTTTCTC ACCTTCATTC 3060
TCAGTGATGT TTTTTGCTTG TAGTATTTTT AGTTGACATT GTTTTCTGTT CAGCAGTTTC 3120
CTTTTAGCTT CCGTATTTCC TGATGAGAAA TCTGCAGTCA TTCAAATTGT TGTTTCCCTG 3180
TATGTAGTGT GTCATTTTTC TGTCAGATTT CAAGGTATTT ATCTTTAGTT TTTAGCCATT 3240
TCATTATGTT GGGGATGAGT TTCCTTGTTT TATTCCCTTT GGAATTTGCT CCAATTCATA 3300
AATTTGCAGT TTTATGTCTT TTACCAAACT TAGAGGTTTT CAGCCTAATT TCTAAAAATA 3360
CTTTTTATTA GCCTGATTTT CATCTTTATA GGAAATAGTT TAAGTGATGA CAAGTTCCAA 3420
TAGCTTATAT GCCCAGAAGG CCTTCAAAAT AAGAATTTTG AAAGAATACA GAAAACAAAC 3480
TTTTATATCC TTCTCATGTC TTCTACTGTA AAATTCATAT GCTTTGCTAC TCTAAACCTA 3540
GTTTGAAATC AACAGTCTTG AGAATAGATG AAAATTTTGA TGAATAGTGG AATTCTTTTA 3600
AATGGAAACC TCTTACATGT GATTTTCCTT GCCATCTAGA AATAAACCAT AGTATTTATG 3660
TTGAATCAAT CAATATTATA TTTTGTTTTT TTCCTCCTCT TCTGAGACTC TTATTGTGGA 3720
AATGTTAGAC TTTTATGTTT TCCTAAATGT CCCTGATATT CTACTTATTT AGAACATCTT 3780
TTCATTTTTT CCATTATTCT GATTGGGTAA TTTTAATTTG TCTATTTTCA AATTTGCTGG 3840
AGTGTTCACC TGTTGTTGTC TGTGTCGTCC CACTGAGTGC ATTCACCACC TTTTAAATTT 3900
TGGTCACTGT ATGTATCAGT TCTAAAATTT CCATTTTGTT CTCTATATTT TAAATTTCTT 3960
GGCTTATATT CTATTTTCCT GCAAATGTGT CAGCATTTGC TTGTTTGAGC TTTTTTTTTT 4020
TCAAGACAGG GTCTCAACTC TGTTACCCAG GCTGGAGTGC AGTGGTGCGA TCTCAGCTCA 4080
CTGCAACCTC TGCCTCCTGG TTCAAGCGAT TATTGTGCCT CAGCCTCCTG AGTAGCTGGG 4140
ATTACAGGCA TGCACCACCA CAGCCCAGCT AATTTTTTGT ATTTTTAGTA GAGACAGAGT 4200
TTTGCTATGT TGGCCAGGCT GGTTTTGAAC TCCTGGCCTC AAGTGATCCA CCCACCTCAG 4260
CCTCCCAAAG TGCTGGGATT ACAGGCCACT ACACCTGGCA CATTTGAGTA TTTTTTTTTT 4320
TTTTTTTTTT TTGAGATGGA GTCTOGCTCT GTCATCTAGG CTGGAGTGCA GTGGTGTGAT 4380
CTCAGCTCAC TGCAGCCTCT GTCTCCCGGG CTCAAGCGAT TCTCTTGCCT CAGCCTCCTG 4440
AGTAGCTAGG ACTACAGGTG CATGCCAACA CGCCCGGCTA ATTTTTTTAA AAAATATTTT 4500
TAGTAGAGAC AGGGTTTCAC CATTTTGGCC AGGATGGTCT CGATCTCCTG ACCTCATGAT 4560
CCACCCGCCT CGGCCTTCCA AAGTGCTGGG ATTACAGGCA TGAGCCACCG TGCCTGGCCT 4620
CATTTGAGTA TTTTTATAAT GTCTCTTTTA AAGTCTTTGT CAGATAATTC CACTGTACAT 4680
GTTATTCAGT GTTTGGTGTC CACTGAGTTG TCATTTGCCA GACAAGTGGA GATTTTTGCA 4740
GCTCATCCTT GTATTCTCAG TAGTTCCGAT ATGTACCCTC GACATGTGAA TGTTATCTTA 4800
TGAGACTCTG TTTTATTTGT ATCCAACAGA AGATGTTTAT TATTTATTTG GCTTTCTGTG 4860
AACTGAGGTC TTAATATCAG CTCATTTTAA AAGTCTTTGC AGTGGTATTC GGATCTATCC 4920
TGTGTGTGCC TATGAGATTG GGTGCAGTGT ATCCTGTTAG CTCCATTCTC AGGGCGTTTG 4980
AATGTGAATT AGGACCAGCG CAATGAATGC TCAAGTTGGG GTTGGGCGTT AGAATTCATA 5040
AAAGTCTTTA TATGCTCAG



Seq ID NO: 83 Protein sequence:
Protein Accession #: NP_055774

1 11 21 31 41 51
| | | | | |
MMRQRQSHYC SVLFLSVNYL GGTFPGDICS EENQIVSSYA SKVCFEIEED YKNRQFLGPE 60
GNVDVELIDK STNRYSVWFP TAGWYLWSAT GLGFLVRDEV TVTIAFGSWS QHLALDLQHH 120
EQWLVGGPLF DVTAEPEEAV AEIHLPHFIS LQGEVDVSWF LVAHFKNEGM VLEHPARVEP 180
FYAVLESPSF SLMGILLRIA SGTRLSIPIT SNTLIYYHPH PEDIKFHLYL VPSDALLTKA 240
IDDEEDRFHG VRLQTSPPME PLNFGSSYIV SNSANLKVMP KELKLSYRSP GEIQHFSKFY 300
AGQMKEPIQL EITEKRHGTL VWDTEVKPVD LQLVAASAPP PFSGAAFVKE NHRQLQARMG 360
DLKGVLDDLQ DNEVLTENEK ELVEQEKTRQ SKNEALLSMV EKKGDLALDV LFRSISERDP 420
YLVSYLRQQN L

Seq ID NO: 84 Nucleotide sequence: 
Nucleic Acid Accession #: NM_007036 

Coding sequence: 56-610 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
CTTCCCACCA GCAAAGACCA CGACTGGAGA GCCGAGCCGG AGGCAGCTGG GAAACATGAA 60
GAGCGTCTTG CTGCTGACCA CGCTCCTCGT GCCTGCACAC CTGGTGGCCG CCTGGAGCAA 120
TAATTATGCG GTGGACTGCC CTCAACACTG TGACAGCAGT GAGTGCAAAA GCAGCCCGCG 180
CTGCAAGAGG ACAGTGCTOG ACGACTGTGG CTGCTGCCGA GTGTGCGCTG CAGGGCGGGG 240
AGAAACTTGC TACCGCACAG TCTCAGGCAT GGATGGCATG AAGTGTGGCC CGGGGCTGAG 300
GTGTCAGCCT TCTAATGGGG AGGATCCTTT TGGTGAAGAG TTTGGTATCT GCAAAGACTG 360
TCCCTACGGC ACCTTCGGGA TGGATTGCAG AGAGACCTGC AACTGCCAGT CAGGCATCTG 420
TGACAGGGGG ACGGGAAAAT GCCTGAAATT CCCCTTCTTC CAATATTCAG TAACCAAGTC 480
TTCCAACAGA TTTGTTTCTC TCACGGAGCA TGACATGGCA TCTGGAGATG GCAATATTGT 540
GAGAGAAGAA GTTGTGAAAG AGAATGCTGC CGGGTCTCCC GTAATGAGGA AATGGTTAAA 600
TCCACGCTGA TCCCGGCTGT GATTTCTGAG AGAAGGCTCT ATTTTCGTGA TTGTTCAACA 660
CACAGCCAAC ATTTTAGGAA CTTTCTAGAT ATAGCATAAG TACATGTAAT TTTTGAAGAT 720
CCAAATTGTG ATGCATGGTG GATCCAGAAA ACAAAAAGTA GGATACTTAC AATCCATAAC 780
ATCCATATGA CTGAACACTT GTATGTGTTT GTTAAATATT OGAATGCATG TAGATTTGTT 840
AAATGTGTGT GTATAGTAAC ACTGAAGAAC TAAAAATGCA ATTTAGGTAA TCTTACATGG 900
AGACAGGTCA ACCAAAGAGG GAGCTAGGCA AAGCTGAAGA CCGCAGTGAG TCAAATTAGT 960
TCTTTGACTT TGATGTACAT TAATGTTGGG ATATGGAATG AAGACTTAAG AGCAGGAGAA 1020
GATGGGGAGG GGGTGGGAGT GGGAAATAAA ATATTTAGCC CTTCCTTGGT AGGTAGCTTC 1080
TCTAGAATTT AATTGTGCTT TTTTTTTTTT TTTGGCTTTG GGAAAAGTCA AAATAAAACA 1140
ACCAGAAAAC CCCTGAAGGA AGTAAGATGT TTGAAGCTTA TGGAAATTTG AGTAACAAAC 1200
AGCTTTGAAC TGAGAGCAAT TTCAAAAGGC TGCTGATGTA GTTCCCGGGT TACCTGTATC 1260
TGAAGGAC6G TTCTGGGGCA TAGGAAACAC ATACACTTCC ATAAATAGCT TTAACGTATG 1320
CCACCTCAGA GATAAATCTA AGAAGTATTT TACCCACTGG TGGTTTGTGT GTGTATGAAG 1380
GTAAATATTT ATATATTTTT ATAAATAAAT GTGTTAGTGC AAGTCATCTT CCCTACCCAT 1440
ATTTATCATC CTCTTGAGGA AAGAAATCTA GTATTATTTG TTGAAAATGG TTAGAATAAA 1500
AACCTATGAC TCTATAAGGT TTTCAAACAT CTGAGGCATG ATAAATTTAT TATCCATAAT 1560
TATAGGAGTC ACTCTGGATT TCAAAAAATG TCAAAAAATG AGCAACAGAG GGACCTTATT 1620
TAAACATAAG TGCTGTGACT TCGGTGAATT TTCAATTTAA GGTATGAAAA TAAGTTTTTA 1680
GGAGGTTTGT AAAAGAAGAA TCAATTTTCA GCAGAAAACA TGTCAACTTT AAAATATAGG 1740
TGGAATTAGG AGTATATTTG AAAGAATCTT AGCACAAACA GGACTGTTGT ACTAGATGTT 1800
CTTAGGAAAT ATCTCAGAAG TATTTTATTT GAAGTGAAGA ACTTATTTAA GAATTATTTC 1860
AGTATTTACC TGTATTTTAT TCTTGAAGTT GGCCAACAGA GTTGTGAATG TGTGTGGAAG 1920
GCCTTTGAAT GTAAAGCTGC ATAAGCTGTT AGGTTTTGTT TTAAAAGGAC ATGTTTATTA 1980
TTGTTCAATA AAAAAGAACA AGATAC

Seq ID NO: 85 Protein sequence:
Protein Accession #: NP_008967.1

1 11 21 31 41 51
| | | | | |
MKSVLLLTTL LVPAHLVAAW SNNYAVDCPQ HCDSSECKSS PRCKRTVLDD CGCCRVCAAG 60
RGETCYRTVS GMDGMKCGPG LRCQPSNGED PFGEEFGICK DCPYGTFGMD CRETCNCQSG 120
ICDRGTGKCL KFPFFQYSVT KSSNRFVSLT EHDMASGDGN IVREEVVKEN AAGSPVMRKW 180
LNPR

Seq ID NO: 86 Nucleotide sequence:
Nucleic Acid Accession #: D86983

Coding sequence: 52-4491 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AGCCGGCCGT GGTGGCTCCG TGCGTCCGAG CGTCCGTCCG CGCCGTCGGC CATGGCCAAG 60

CGCTCCAGGG GCCCCGGGCG CCGCTGCCTG TTGGCGCTCG TGCTGTTCTG CGCCTGGGGG 120

ACGCTGGCCG TGGTGGCCCA GAAGCCGGGC GCAGGGTGTC CGAGCCGCTG CCTGTGCTTC 180

CGCACCACCG TGCGCTGCAT GCATCTGCTG CTGGAGGCCG TGCCCGCCGT GGCGCCGCAG 240

ACCTCCATCC TAGATCTTCG CTTTAACAGA ATCAGAGAGA TCCAACCTGG GGCATTCAGG 300

CGGCTGAGGA ACTTGAACAC ATTGCTTCTC AATAATAATC AGATCAAGAG GATACCTAGT 360

GGAGCATTTG AAGACTTGGA AAATTTAAAA TATCTCTATC TGTACAAGAA TGAGATCCAG 420

TCAATTGACA GGCAAGCATT TAAGGGACTT GCCTCTCTAG AGCAACTATA CCTGCACTTT 480

AATCAGATAG AAACTTTGGA CCCAGATTCG TTCCAGCATC TCCCGAAGCT CGAGAGGCTA 540

TTTTTGCATA ACAACCGGAT TACACATTTA GTTCCAGGGA CATTTAATCA CTTGGAATCr 600

ATGAAGAGAT TGCGACTGGA CTCAAACACA CTTCACTGCG ACTGTGAAAT CCTGTGGTTG 660

GCGGATTTGC TGAAAACCTA CGCGGAGTCG GGGAACGCGC AGGCAGCGGC CATCTGTGAA 720

TATCCCAGAC GCATCCAGGG ACGCTCAGTG GCAACCATCA CCCCGGAAGA GCTGAACTGT 780

GAAAGGCCCC GGATCACCTC CGAGCCCCAG GACGCAGATG TGACCTCGGG GAACACCGTG 840

TACTTCACCT GCAGAGCCGA AGGCAACCCC AAGCCTGAGA TCATCTGGCT GCGAAACAAT 900

AATGAGCTGA GCATGAAGAC AGATTCCCGC CTAAACTTGC TGGACGATGG GACCCTGATG 960

ATCCAGAACA CACAGGAGAC AGACCAGGGT ATCTACCAGT GCATGGCAAA GAACGTGGCC 1020

GGAGAGGTGA AGACGCAAGA GGTGACCCTC AGGTACTTCG GGTCTCCAGC TCGACCCACT 1080

TTTGTAATCC AGCCACAGAA TACAGAGGTG CTGGTTGGGG AGAGCGTCAC GCTGGAGTGC 1140

AGCGCCACAG GCCACCCCCC GCCGCGGATC TCCTGGACGA GAGGTGACCG CACACCCTTG 1200

CCAGTTGACC CGCGGGTGAA CATCACGCCT TCTGGOGGGC TTTACATACA GAACGTCGTA 1260

CAGGGGGACA GCGGAGAGTA TGCGTGCTCT GCGACCAACA ACATTGACAG CGTCCATGCC 1320

ACCGCTTTCA TCATCGTCCA GGCTCTTCCT CAGTTCACTG TGACGCCTCA GGACAGAGTC 1380

GTTATTGAGG GCCAGACCGT GGATTTCCAG TGTGAAGCCA AGGGCAACCC GCCGCCCGTC 1440

ATCGCCTGGA CCAAGGGAGG GAGCCAGCTC TCCGTGGACC GGCGGCACCT GGTCCTGTCA 1500

TCGGGAACAC TTAGAATCTC TGGTGTTGCC CTCCACGACC AGGGCCAGTA OGAATGCCAG 1560

GCTGTCAACA TCATCGGCTC CCAGAAGGTC GTGGCCCACC TGACTGTGCA GCCCAGAGTC 1620

ACCCCAGTGT TTGCCAGCAT TCCCAGCGAC ACAACAGTGG AGGTGGGCGC CAATGTGCAG 1680

CTCCCGTGCA GCTCCCAGGG CGAGCCCGAG CCAGCCATCA CCTGGAACAA GGATGGGGTT 1740

CAGGTGACAG AAAGTGGAAA ATTTCACATC AGCCCTGAAG GATTCTTGAC CATCAATGAC 1800

GTTGGCCCTG CAGACGCAGG TCGCTATGAG TGTGTGGCCC GGAACACCAT TGGGTCGGCC 1860

TCGGTGAGCA TGGTGCTCAG TGTGAACGTT CCTGACGTCA GTCGAAATGG AGATCCGTTT 1920

GTAGCTACCT CCATCGTGGA AGCGATTGCG ACTGTTGACA GAGCTATAAA CTCAACCCGA 1980

ACACATTTGT TTGACAGCCG TCCTCGTTCT CCAAATGATT TGCTGGCCTT GTTCCGGTAT 2040

CCGAGGGATC CTTACACAGT TGAACAGGCA CGGGCGGGAG AAATCTTTGA ACGGACATTG 2100

CAGCTCATTC AGGAGCATGT ACAGCATGGC TTGATGGTCG ACCTCAACGG AACAAGTTAC 2160

CACTACAACG ACCTGGTGTC TCCACAGTAC CTGAACCTCA TCGCAAACCT GTCGGGCTGT 2220

ACCGCCCACC GGCGCGTGAA CAACTGCTCG GACATGTGCT TCCACCAGAA GTACCGGACG 2280

CACGACGGCA CCTGTAACAA CCTGCAGCAC CCCATGTGGG GCGCCTCGCT GACCGCCTTC 2340

GAGCGCCTGC TGAAATCCGT GTACGAGAAT GGCTTCAACA CCCCTCGGGG CATCAACCCC 2400

CACCGACTGT ACAACGGGCA CGCCCTTCCC ATGCCGCGCC TGGTGTCCAC CACCCTGATC 2460

GGGACGGAGA CCGTCACACC CGACGAGCAG TTCACCCACA TGCTGATGCA GTGGGGCCAG 2520

TTCCTGGACC ACGACCTCGA CTCCACGGTG GTGGCCCTGA GCCAGGCACG CTTCTCCGAC 2580

GGACAGCACT GCAGCAACGT GTGCAGCAAC GACCCCCCCT GCTTCTCTGT CATGATCCCC 2640 

CCCAATGACT CCCGGGCCAG GAGCGGGGCC CGCTGCATGT TCTTCGTGCG CTCCAGCCCT 2700 

GTGTGCGGCA GOGGCATGAC TTCGCTGCTC ATGAACTCCG TGTACCCGCG GGAGCAGATC 2760 

AACCAGCTCA CCTCCTACAT CGACGCATCC AACGTGTACG GGAGCACGGA GCATGAGGCC 2820

CGCAGCATCC GCGACCTGGC CAGCCACCGC GGCCTGCTGC GGCAGGGCAT CGTGCAGCGG 2880 

TCCGGGAAGC CGCTGCTCCC CTTCGCCACC GGGCCGCCCA CGGAGTGCAT GCGGGACGAG 2940 

AACGAGAGCC CCATCCCCTG CTTCCTGGCC GGGGACCACC GCGCCAACGA GCAGCTGGGC 3000 

CTGACCAGCA TGCACACGCT GTGGTTCCGC GAGCACAACC GCATTGCCAC GGAGCTGCTC 3060 

AAGCTGAACC CGCACTGGGA CGGCGACACC ATCTACTATG AGACCAGGAA GATCGTGGGT 3120

GCGGAGATCC AGCACATCAC CTACCAGCAC TGGCTCCCGA AGATCCTGGG GGAGGTGGGC 3180 

ATGAGGACGC TGGGAGAGTA CCACGGCTAC GACCCCGGCA TCAATCCTGG CATCTTCAAC 3240 

GCCTTCGCCA CCGCGGCCTT CAGGTTTGGC CACACGCTTG TCAACCCACT GCTTTACCGG 3300 

CTGGACGAGA ACTTCCAGCC CATTGCACAA GATCACCTCC CCCTTCACAA AGCTTTCTTC 3360 

TCTCCCTTCC GGATTGTGAA TGAGGGCGGC ATCGATCCGC TTCTCAGGGG GCTGTTCGGG 3420

GTGGCGGGGA AAATGCGTGT GCCCTCGCAG CTGCTGAACA CGGAGCTCAC GGAGCGGCTG 3480 

TTCTCCATGG CACACACGGT GGCTCTGGAC CTGGCGGCCA TCAACATCCA GCGGGGCCGG 3540 

GACCACGGGA TCCCACCCTA CCACGACTAC AGGGTCTACT GCAATCTATC GGCGGCACAC 3600 

ACGTTCGAGG ACCTGAAAAA TGAGATTAAA AACCCTGAGA TCCGGGAGAA ACTGAAAAGG 3660 

TTGTATGGCT CGACACTCAA CATCGACCTG TTTCCGGCGC TCGTGGTGGA GGACCTGGTG 3720

CCTGGCAGCC GGCTGGGCCC CACCCTGATG TGTCTTCTCA GCACACAGTT CAAGCGCCTG 3780 

CGAGATGGGG ACAGGTTGTG GTATGAGAAC CCTGGGGTGT TCTCCCCGGC CCAGCTGACT 3840 

CAGATCAAGC AGACGTCGCT GGCCAGGATC CTATGCGACA ACGCGGACAA CATCACCCGG 3900 

GTGCAGAGCG ACGTGTTCAG GGTGGCGGAG TTCCCTCACG GCTACGGCAG CTGTGACGAG 3960 

ATCCCCAGGG TGGACCTCCG GGTGTGGCAG GACTGCTGTG AAGACTGTAG GACCAGGGGG 4020

CAGTTCAA7G CCTTTTCCTA TCATTTCCGA GGCAGACGGT CTCTTGAGTT CAGCTACCAG 4080 

GAGGACAAGC CGACCAAGAA AACAAGACCA CGGAAAATAC CCAGTGTTGG GAGACAGGGG 4140 

GAACATCTCA GCAACAGCAC CTCAGCCTTC AGCACACGCT CAGATGCATC TGGGACAAAT 4200 

GACTTCAGAG AGTTTGTTCT GGAAATGCAG AAGACCATCA CAGACCTCAG AACACAGATA 4260 

AAGAAACTTG AATCACGGCT CAGTACCACA GAGTGCGTGG ATGCCGGGGG CGAATCTCAC 4320

GCCAACAACA CCAAGTGGAA AAAAGATGCA TGCACCATTT GTGAATGCAA AGACGGGCAG 4380 

GTCACCTGCT TCGTGGAAGC TTGCCCCCCT GCCACCTGTG CTGTCCCCGT GAACATCCCA 4440 

GGGGCCTGCT GTCCAGTCTG CTTACAGAAG AGGGCGGAGG AAAAGCCCTA GGCTCCTGGG 4500

AGGCTCCTCA GAGTTTGTCT GCTGTGCCAT CGTGAGATCG GGTGGCCGAT GGCAGGGAGC 4560 

TGCGGACTGC AGACCAGGAA ACACCCAGAA CTCGTGACAT TTCATGACAA CGTCCAGCTG 4620

GTGCTGTTAC AGAAGGCAGT GCAGGAGGCT TCCAACCAGA GCATCTGCGG AGAAGGAGGC 4680 

ACAGCAGGTG CCTGAAGGGA AGCAGGCAGG AGTCCTAGCT TCACGTTAGA CTTCTCAGGT 4740 

TTTTATTTAA TTCTTTTAAA ATGAAAAATT GGTGCTACTA TTAAATTGCA CAGTTGAATC 4800 

ATTTAGGCGC CTAAATTGGT TTTGCCTCCC AACACCATTT CTTTTTAAAT AAAGCAGGAT 4860 

ACCTCTATAT GTCAGCCTTG CCTTGTTCAG ATGCCAGGAG CCGGCAGACC TGTCACCCGC 4920

AGGTGGGGTG AGTCTCGGAG CTGCCAGAGG GGCTCACCGA AATCGGGGTT CCATCACAAG 4980 

CTATGTTTAA AAAGAAAATT GGTGTTTGGC AAACGGAACA GAACCTTTGA TGAGAGCGTT 5040 

CACAGGGACA CTGTCTGGGG GTCCAGTGCA AGCCCCCGGC CTCTTCCCTG GGAACCTCTG 5100 

AACTCCTCCT TCCTCTGGGC TCTCTGTAAC ATTTCACCAC ACGTCAGCAT CTAATCCCAA 5160 

GACAAACATT CCCGCTGCTC GAAGCAGCTG TATAGCCTGT GACTCTCCGT GTGTCAGCTC 5220

CTTCCACACC TGATTAGAAC ATTCATAAGC CACATTTAGA AACAGATTTG CTTTCAGCTG 5280 

TCACTTGCAC ACATACTGCC TAGTTGTGAA CCAAATGTGA AAAAACCTCC TTCATCCCAT 5340 

TGTGTATCTG ATACCTGCCG AGGGCCAAGG GTGTGTGTTG ACAACGCCGC TCCCAGCCGG 5400 

CCCTGGTTGC GTCCACGTCC TGAACAAGAG CCGCTTCCGG ATGGCTCTTC CCAAGGGAGG 5460 

AGGAGCTCAA GTGTCGGGAA CTGTCTAACT TCAGGTTGTG TGAGTGCGTT

Seq ID NO: 87 Protein sequence: 
Protein Accession #: BAA13219 

1 11 21 31 41 51 

| | | | | |

SRPWWLRASE RPSAPSAMAK RSRGPGRRCL LALVLFCAWG TLAVVAQKPG AGCPSRCLCF 60

RTTVRCMHLL LEAVPAVAPQ TSILDLRFNR IREIQPGAFR RLRNLNTLLL NNNQIKRIPS 120

GAFEDLENLK YLYLYKNEIQ SIDRQAFKGL ASLEQLYLHF NQIETLDPDS FQHLPKLERL 180

FLHNNRITHL VPGTFNHLES MKRLRLDSNT LHCDCEILWL ADLLKTYAES GNAQAAAICE 240

YPRRIQGRSV ATITPEELNC ERPRITSEPQ DADVTSGNTV YFTCRAEGNP KPEIIWLRNN 300

NELSMKTDSR LNLLDDGTLM IQNTQETDQG IYQCMAKNVA GEVKTQEVTL RYFGSPARPT 360

FVIQPQNTEV LVGESVTLEC SATGHPPPRI SWTRGDRTPL PVDPRVNITP SGGLYIQNVV 420

QGDSGEYACS ATNNIDSVHA TAFIIVQALP QFTVTPQDRV VIEGQTVDFQ CEAKGNPPPV 480 

IAWTKGGSQL SVDRRHLVLS SGTLRISGVA LHDQGQYECQ AVNIIGSQKV VAHLTVQPRV 540 

TPVFASIPSD TTVEVGANVQ LPCSSQGEPE PAITWNKDGV QVTESGKFHI SPEGFLTIND 600

VGPADAGRYE CVARNTIGSA SVSMVLSVNV PDVSRNGDPF VATSIVEAIA TVDRAINSTR 660

THLFDSRPRS PNDLLALFRY PRDPYTVEQA RAGEIFERTL QLIQEHVQHG LMVDLNGTSY 720

HYNDLVSPQY LNLIANLSGC TAHRRVNNCS DMCFHQKYRT HDGTCNNLQH PMWGASLTAF 780

ERLLKSVYEN GFNTPRGINP HRLYNGHALP MPRLVSTTLI GTETVTPDEQ FTHMLMQWGQ 840 

FLDHDLDSTV VALSQARFSD GQHCSNVCSN DPPCFSVMIP PNDSRARSGA RCMFFVRSSP 900

VCGSGMTSLL MNSVYPREQI NQLTSYIDAS NVYGSTEHEA RSIRDLASHR GLLRQGIVQR 960 

SGKPLLPFAT GPPTECMRDE NESPIPCFLA GDHRANEQLG LTSMHTLWFR EHNRIATELL 1020 

KLNPHWDGDT IYYETRKIVG AEIQHITYQH WLPKILGEVG MRTLGEYHGY DPGINPGIFN 1080

AFATAAFRFG HTLVNPLLYR LDENFQPIAQ DHLPLHKAFF SPFRIVNEGG IDPLLRGLFG 1140 

VAGKMRVPSQ LLNTELTERL FSMAHTVALD LAAINIQRGR DHGIPPYHDY RVYCNLSAAH 1200

TFEDLKNEIK NPEIREKLKR LYGSTLNIDL FPALVVEDLV PGSRLGPTLM CLLSTQFKRL 1260
RDGDRLWYEN PGVFSPAQLT QIKQTSLARI LCDNADNITR VQSDVFRVAE FPHGYGSCDE 1320

IPRVDLRVWQ DCCEDCRTRG QFNAFSYHFR GRRSLEFSYQ EDKPTKKTRP RKIPSVGRQG 1380

EHLSNSTSAF STRSDASGTN DFREFVLEMQ KTITDLRTQI KKLESRLSTT ECVDAGGESH 1440
ANNTKWKKDA CTICECKDGQ VTCFVEACPP ATCAVPVNIP GACCPVCLQK RAEEKP 



Seq ID NO: 88 DNA sequence 

Nucleic Acid Accession #: NM_004834.1

Coding sequence: 80-3577 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

AATTCGAGGA TCCGGGTACC ATGGCACAGA GCGACAGAGA CATTTATTGT TATTTGTTTT 60

TTGGTGGCAA AAAGGGAAAA TGGCQAACGA CTCCCCTGCA AAAAGTCTGG TGGACATCGA 120

CCTCTCCTCC CTGCGGGATC CTGCTGGGAT TTTTGAGCTG GTGGAAGTGG TTGGAAATGG 180 

CACCTATGGA CAAGTCTATA AGGGTCGACA TGTTAAAACG GGTCAGTTGG CAGCCATCAA 240 

AGTTATGGAT GTCACTGAGG ATGAAGAGGA AGAAATCAAA CTGGAGATAA ATATGCTAAA 300 

GAAATACTCT CATCACAGAA ACATTGCAAC ATATTATGGT GCTTTCATCA AAAAGAGCCC 360 

TCCAGGACAT GATGACCAAC TCTGGCTTGT TATGGAGTTC TGTGGGGCTG GGTCCATTAC 420 

AGACCTTGTG AAGAACACCA AAGGGAACAC ACTCAAAGAA GACTGGATCG CTTACATCTC 480 

CAGAGAAATC CTGAGGGGAC TGGCACATCT TCACATTCAT CATGTGATTC ACCGGGATAT 540 

CAAGGGCCAG AATGTGTTGC TGACTGAGAA TGCAGAGGTG AAACTTGTTG ACTTTGGTGT 600

GAGTGCTCAG CTGGACAGGA CTGTGGGGCG GAGAAATACG TTCATAGGCA CTCCCTACTG 660 

GATGGCTCCT GAGGTCATCG CCTGTGATGA GAACCCAGAT GCCACCTATG ATTACAGAAG 720 

TGATCTTTGG TCTTGTGGCA TTACAGCCAT TGAGATGGCA GAAGGTGCTC CCCCTCTCTG 780 

TGACATGCAT CCAATGAGAG CACTGTTTCT CATTCCCAGA AACCCTCCTC CCCGGCTGAA 840 

GTCAAAAAAA TGGTCGAAGA AGTTTTTTAG TTTTATAGAA GGGTGCCTGG TGAAGAATTA 900 

CATGCAGCGG CCCTCTACAG AGCAGCTTTT GAAACATCCT TTTATAAGGG ATCAGCCAAA 960 

TGAAAGGCAA GTTAGAATCC AGCTTAAGGA TCATATAGAT CGTACCAGGA AGAAGAGAGG 1020 

CGAGAAAGAT GAAACTGAGT ATGAGTACAG TGGGAGTGAG GAAGAAGAGG AGGAAGTGCC 1080 

TGAACAGGAA GGAGAGCCAA GTTCCATTGT GAACGTGCCT GGTGAGTCTA CTCTTCGCCG 1140 

AGATTTCCTG AGACTGCAGC AGGAGAACAA GGAACGTTCC GAGGCTCTTC GGAGACAACA 1200 

GTTACTACAG GAGCAACAGC TCCGGGAGCA GGAAGAATAT AAAAGGCAAC TGCTGGCAGA 1260 

GAGACAGAAG CGGATTGAGC AGCAGAAAGA ACAGAGGCGA CGGCTAGAAG AGCAACAAAG 1320 

GAGAGAGCGG GAGGCTAGAA GGCAGCAGGA ACGTGAACAG CGAAGGAGAG AACAAGAAGA 1380 

AAAGAGGCGT CTAGAGGAGT TGGAGAGAAG GCGCAAAGAA GAAGAGGAGA GGAGACGGGC 1440 

AGAAGAAGAA AAGAGGAGAG TTGAAAGAGA ACAGGAGTAT ATCAGGCGAC AGCTAGAAGA 1500 

GGAGCAGCGG CACTTGGAAG TCCTTCAGCA GCAGCTGCTC CAGGAGCAGG CCATGTTACT 1560 

GCATGACCAT AGGAGGCCGC ACCCGCAGCA CTCGCAGCAG CCGCCACCAC GGCAGCAGGA 1620 

AAGGAGCAAG CCAAGCTTCC ATGCTCCCGA GCCCAAAGCC CACTACGAGC CTGCTGACCG 1680 

AGCGCGAGAG GTTCCTGTGA GAACAACATC TCGCTCCCCT GTTCTGTCCC GTCGAGATTC 1740 

CCCACTGCAG GGCAGTGGGC AGCAGAATAG CCAGGCAGGA CAGAGAAACT CCACCAGTAT 1800 

TGAGCCCAGG CTTCTGTGGG AGAGAGTGGA GAAGCTGGTG CCCAGACCTG GCAGTGGCAG 1860 

CTCCTCAGGG TCCAGCAACT CAGGATCCCA GCCCGGGTCT CACCCTGGGT CTCAGAGTGG 1920 

CTCCGGGGAA CGCTTCAGAG TGAGATCATC ATCCAAGTCT GAAGGCTCTC CATCTCAGCG 1980 

CCTGGAAAAT GCAGTGAAAA AACCTGAAGA TAAAAAGGAA GTTTTCAGAC CCCTCAAGCC 2040 

TGCTGGCGAA GTGGATCTGA CCGCACTGGC CAAAGAGCTT CGAGCAGTGG AAGATGTACG 2100 

GCCACCTCAC AAAGTAACGG ACTACTCCTC ATCCAGTGAG GAGTCGGGGA CGACGGATGA 2160 

GGAGGACGAC GATGTGGAGC AGGAAGGGGC TGACGAGTCC ACCTCAGGAC CAGAGGACAC 2220 

CAGAGCAGCG TCATCTCTGA ATTTGAGCAA TGGTGAAACG GAATCTGTGA AAACCATGAT 2280 

TGTCCATGAT GATGTAGAAA GTGAGCCGGC CATGACCCCA TCCAAGGAGG GCACTCTAAT 2340 

CGTCCGCCAG ACTCAGTCCG CTAGTAGCAC ACTCCAGAAA CACAAATCTT CCTCCTCCTT 2400 

TACACCTTTT ATAGACCCCA GATTACTACA GATTTCTCCA TCTAGCGGAA CAACAGTGAC 2460 

ATCTGTGGTG GGATTTTCCT GTGATGGGAT GAGACCAGAA GCCATAAGGC AAGATCCTAC 2520 

CCGGAAAGGC TCAGTGGTCA ATGTGAATCC TACCAACACT AGGCCACAGA GTGACACCCC 2580 

GGAGATTCGT AAATACAAGA AGAGGTTTAA CTCTGAGATT CTGTGTGCTG CCTTATGGGG 2640 

AGTGAATTTG CTAGTGGGTA CAGAGAGTGG CCTGATGCTG CTGGACAGAA GTGGCCAAGG 2700 

GAAGGTCTAT CCTCTTATCA ACCGAAGACG ATTTCAACAA ATGGACGTAC TTGAGGGCTT 2760 

GAATGTCTTG GTGACAATAT CTGGCAAAAA GGATAAGTTA CGTGTCTACT ATTTGTCCTG 2820 

GTTAAGAAAT AAAATACTTC ACAATGATCC AGAAGTTGAG AAGAAGCAGG GATGGACAAC 2880 

CGTAGGGGAT TTGGAAGGAT GTGTACATTA TAAAGTTGTA AAATATGAAA GAATCAAATT 2940 

TCTGGTGATT GCTTTGAAGA GTTCTGTGGA AGTCTATGCG TGGGCACCAA AGCCATATCA 3000 

CAAATTTATG GCCTTTAAGT CATTTGGAGA ATTGGTACAT AAGCCATTAC TGGTGGATCT 3060 

CACTGTTGAG GAAGGCCAGA GGTTGAAAGT GATCTATGGA TCCTGTGCTG GATTCCATGC 3120 

TGTTGATGTG GATTCAGGAT CAGTCTATGA CATTTATCTA CCAACACATG TAAGAAAGAA 3180 

CCCACACTCT ATGATCCAGT GTAGCATCAA ACCCCATGCA ATCATCATCC TCCCCAATAC 3240 

AGATGGAATG GAGCTTCTGG TGTGCTATGA AGATGAGGGG GTTTATGTAA ACACATATGG 3300 

AAGGATCACC AAGGATGTAG TTCTACAGTG GGGAGAGATG CCTACATCAG TAGCATATAT 3360 

TCGATCCAAT CAGACAATGG GCTGGGGAGA GAAGGCCATA GAGATCCGAT CTGTGGAAAC 3420 

TGGTCACTTG GATGGTGTGT TCATGCACAA AAGGGCTCAA AGACTAAAAT TCTTGTGTGA 3480 

ACGCAATGAC AAGGTGTTCT TTGCCTCTGT TCGGTCTGGT GGCAGCAGTC AGGTTTATTT 3540 

CATGACCTTA GGCAGGACTT CTCTTCTGAG CTGGTAGAAG CAGTGTGATC CAGGGATTAC 3600 

TGGCCTCCAG AGTCTTCAAG ATCCTGAGAA CTTGGAATTC CTTGTAACTG GAGCTCGGAG 3660 

CTGCACCGAG GGCAACCAGG ACAGCTGTGT GTGCAGACCT CATGTGTTGG GTTCTCTCCC 3720 

CTCCTTCCTG TTCCTCTTAT ATACCAGTTT ATCCCCATTC TTTTTTTTTT TCTTACTCCA 3780 
AAATAAATCA AGGCTGCAAT GCAGCTGGTG CTGTTCAGAT TCCAAAAAAA AAAAAAAACC 3840 
ATGGTACCCG GATCCTCGAA TTCC 



Seq ID No: 89 Protein sequence: 
Protein Accession #: NP_004825.1 



1 11 21 31 41 51 

I I I I I I 

MANDSPAKSL VDIDLSSLRD PAGIFELVEV VGNGTYGQVY KGRHVKTGQL AAIKVMDVTE 60 

DEEEEIKLEI NMLKKYSHHR NIATYYGAFI KKSPPGHDDQ LWLVMEFCGA GSITDLVKNT 120 

KGNTLKEDWI AYISREILRG LAHLHIHHVI HRDIKGQNVL LTENAEVKLV DFGVSAQLDR 180 

TVGRRNTFIG TPYWMAPEVI ACDENPDATY DYRSDLWSCG ITAIEMAEGA PPLCDMHPMR 240 

ALFLIPRNPP PRLKSKKWSK KFFSFIEGCL VKNYMQRPST EQLLKHPFIR DQPNERQVRI 300 

QLKDHIDRTR KKRGEKDETE YEYSGSEEEE EEVPEQEGEP SSIVNVPGES TLRRDFLRLQ 360 

QENKERSEAL RRQQLLQEQQ LREQEEYKRQ LLAERQKRIE QQKEQRRRLE EQQRREREAR 420 

RQQEREQRRR EQEEKRRLEE LERRRKEEEE RRRAEEEKRR VEREQEYIRR QLEEEQRHLE 480 

VLQQQLLQEQ AMLLHDHRRP HPQHSQQPPP PQQERSKPSF HAPEPKAHYE PADRAREVPV 540 

RTTSRSPVLS RRDSPLQGSG QQNSQAGQRN STSIEPRLLW ERVEKLVPRP GSGSSSGSSN 600 

SGSQPGSHPG SQSGSGERFR VRSSSKSEGS PSQRLENAVK KPEDKKEVFR PLKPAGEVDL 660 

TALAKELRAV EDVRPPHKVT DYSSSSEESG TTDEEDDDVE QEGADESTSG PEDTRAASSL 720 

NLSNGETESV KTMIVHDDVE SEPAMTPSKE GTLIVRQTQS ASSTLQKHKS SSSFTPFIDP 780 

RLLQISPSSG TTVTSVVGFS CDGMRPEAIR QDPTRKGSVV NVNPTNTRPQ SDTPEIRKYK 840 

KRFNSEILCA ALWGVNLLVG TESGLMLLDR SGQGKVYPLI NRRRFQQMDV LEGLNVLVTI 900 

SGKKDKLRVY YLSWLRNKIL HNDPEVEKKQ GWTTVGDLEG CVHYKVVKYE RIKFLVIALK 960 

SSVEVYAWAP KPYHKFMAFK SFGELVHKPL LVDLTVEEGQ RLKVIYGSCA GFHAVDVDSG 1020 

SVYDIYLPTH VRKNPHSMIQ CSIKPHAIII LPNTDGMELL VCYEDEGVYV NTYGRITKDV 1080 

VLQWGEMPTS VAYIRSNQTM GWGEKAIEIR SVETGHLDGV FMHKRAQRLK FLCERNDKVF 1140 
FASVRSGGSS QVYFMTLGRT SLLSW 



Seq ID NO: 90 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 2-71 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TTACACTTCA ATTCCTTACA CGGTATTTCA AACAAACAGT TTTGCTGAGA GGAGCTTTTG 60 

TCTCTCCTTA AGAAAATGTT TATAAAGCTG AAAGGAAATC AAACAGTAAT CTTAAAAATG 120 

AAAACAAAAC AACCCAACAA CCTAGATAAC TACAGTGATC AGGGAGCACA GTTCAACTCC 180 

TTGTTATGTT TTAGTCATAT GGCCTACTCA AACAGCTAAA TAACAACACC AGTGGCAGAT 240 

AAAAATCACC ATTTATCTTT CAGCTATTAA TCTTTTGAAT GAATAAACTG TGACAAACAA 300 

ATTAACATTT TTGAACATGA AAGGCAACTT CTGCACAATC CTGTATCCAA GCAAACTTTA 360 

AATTATCCAC TTAATTATTA CTTAATCTTA AAAAAAATTA GAACCCAGAA CTTTTCAATG 420 

AAGCATTTGA AAGTTGAAGT GGAATTTAGG AAAGCCATAA AAATATAAAT ACTGTTATCA 480 

CAGCACCAGC AAGCCATAAT CTTTATACCT ATCAGTTCTA TTTCTATTAA CAGTAAAAAC 540 

ATTAAGCAAG ATATAAGACT ACCTGCCCAA GAATTCAGTC TTTTTTCATT TTTGTTTTTC 600 

TCAGTTCTGA GGATGTTAAT CGTCAAATTT TCTTTGGACT GCATTCCTCA CTACTTTTTG 660 

CACAATGGTC TCACGTTCTC ACATTTGTTC TCGCGAATAA ATTGATAAAA GGTGTTAAGT 720 

TCTGTGAATG TCTTTTTAAT TATGGGCATA ATTGTCCTTG ACTGGATAAA AACTTAAGTC 780 

CACCCTTATG TTTATAATAA TTTCTTGAGA ACAGCAAACT GCATTTACCA TCGTAAAACA 840 

ACATCTGACT TACGGGAGCT GCAGGGAAGT GGTGAGACAG TTCGAACGGC TCCTCAGAAA 900 

TCCAGTGACC CAATTCTAAA GACCATAGCA CCTGCAAGTG ACACAACAAG CAGATTTATT 960 

ATACATTTAT TAGCCTTAGC AGGCAATAAA CCAAGAATCA CTTTGAAGAC ACAGCAAAAA 1020 

GTGATACACT CCGCAGATCT GAAATAGATG TGTTCTCAGA CAACAAAGTC CCTTCAGAAT 1080 
CTTCATGTTG CATAAATGTT ATGAATATTA ATAAAAAGTT GATTGAGA 



Seq ID No: 91 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

YTSIPYTVFQ TNSFAERSFC LSL 
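
By way of illustration only, the relationship between a listed coding range and the listed protein can be checked with a short script. The sketch below assumes that the coding range is given as 1-based, inclusive nucleotide positions and that the standard genetic code applies; the names `CODON_TABLE`, `translate_region` and `SEQ_90_START` are hypothetical helpers introduced here and are not part of the listing. It translates only complete codons within the range and ignores any trailing partial codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_region(seq: str, start: int, end: int) -> str:
    """Translate seq[start..end] (1-based, inclusive), complete codons only."""
    cleaned = "".join(seq.split()).upper()      # drop the 10-base group spacing
    region = cleaned[start - 1:end]
    usable = len(region) - len(region) % 3      # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(region[i:i + 3], "X")   # "X" marks an unreadable codon
        for i in range(0, usable, 3)
    )


# First 72 nucleotides of Seq ID NO: 90 as listed above.
SEQ_90_START = (
    "TTACACTTCA ATTCCTTACA CGGTATTTCA AACAAACAGT TTTGCTGAGA GGAGCTTTTG "
    "TCTCTCCTTA AG"
)

print(translate_region(SEQ_90_START, 2, 71))  # YTSIPYTVFQTNSFAERSFCLSL
```

For Seq ID NO: 90 the listed range 2-71 spans 70 nucleotides, so the final nucleotide does not complete a codon and is ignored; the 23 complete codons give the 23 residues listed for Seq ID No: 91.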



Seq ID NO: 92 DNA sequence 

Nucleic Acid Accession #: NM_003706.1 

Coding sequence: 310-1935 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CACGAGGCAG GGGCCATTTT ACCTCCAGGT TGGCCCTGCT CAGGACCAGG AGGAAACACC 60 
TCCAGCCCGC GACCTCCTCC CACAGGGGGA AAAGGAAAGC AGGAGGACCA CAGAAGCTTT 120 

GGCACCGAGG ATCCCCGCAG TCTTCACCCG CGGAGATTCC GGCTGAAGGA GCTGTCCAGC 180 

GACTACACCG CTAAGCGCAG GGAGCCCAAG CCTCCGCACC GGATTCCGGA GCACAAGCTC 240 

CACCGCGCAT GCGCACACGC CCCAGACCCA GGCTCAGGAG GACTGAGAAT TTTCTGACCG 300 

CAGTGCACCA TGGGAAGCTC TGAAGTTTCC ATAATTCCTG GGCTCCAGAA AGAAGAAAAG 360 

GCGGCCGTGG AGAGACGAAG ACTTCATGTG CTGAAAGCTC TGAAGAAGCT AAGGATTGAG 420 

GCTGATGAGG CCCCAGTTGT TGCTGTGCTG GGCTCAGGCG GAGGACTGCG GGCTCACATT 480 

GCCTGCCTTG GGGTCCTGAG TGAGATGAAA GAACAGGGCC TGTTGGATGC CGTCACGTAC 540 

CTCGCAGGGG TCTCTGGATC CACTTGGGCA ATATCTTCTC TCTACACCAA TGATGGTGAC 600 

ATGGAAGCTC TCGAGGCTGA CCTGAAACAT CGATTTACCC GACAGGAGTG GGACTTGGCT 660 

AAGAGCCTAC AGAAAACCAT CCAAGCAGCG AGGTCTGAGA ATTACTCTCT GACCGACTTC 720 

TGGGCCTACA TGGTTATCTC TAAGCAAACC AGAGAACTGC CGGAGTCTCA TTTGTCCAAT 780 

ATGAAGAAGC CCGTGGAAGA AGGGACACTA CCCTACCCAA TATTTGCAGC CATTGACAAT 840 

GACCTGCAAC CTTCCTGGCA GGAGGCAAGA GCACCAGAGA CCTGGTTCGA GTTCACCCCT 900 

CACCACGCTG GCTTCTCTGC ACTGGGGGCC TTTGTTTCCA TAACCCACTT CGGAAGCAAA 960 

TTCAAGAAGG GAAGACTGGT CAGAACTCAC CCTGAGAGAG ACCTGACTTT CCTGAGAGGT 1020 

TTATGGGGAA GTGCTCTTGG TAACACTGAA GTCATTAGGG AATACATTTT TGACCAGTTA 1080 

AGGAATCTGA CCCTGAAAGG TTTATGGAGA AGGGCTGTTG CTAATGCTAA AAGCATTGGA 1140 

CACCTTATTT TTGCCCGATT ACTGAGGCTG CAAGAAAGTT CACAAGGGGA ACATCCTCCC 1200 

CCAGAAGATG AAGGCGGTGA GCCTGAACAC ACCTGGCTGA CTGAGATGCT CGAGAATTGG 1260 

ACCAGGACCT CCCTGGAAAA GCAGGAGCAG CCCCATGAGG ACCCCGAAAG GAAAGGCTCA 1320 

CTCAGTAACT TGATGGATTT TGTGAAGAAA ACAGGCATTT GCGCTTCAAA GTGGGAATGG 1380 

GGGACCACTC ACAACTTCCT GTACAAACAC GGTGGCATCC GGGACAAGAT AATGAGCAGC 1440 

CGGAAGCACC TCCACCTGGT GGATGCTGGT TTAGCCATCA ACACTCCCTT CCCACTCGTG 1500 

CTGCCCCCGA CGCGGGAGGT TCACCTCATC CTCTCCTTCG ACTTCAGTGC CGGAGATCCT 1560 

TTCGAGACCA TCCGGGCTAC CACTGACTAC TGCCGCCGCC ACAAGATCCC CTTTCCCCAA 1620 

GTAGAAGAGG CTGAGCTGGA TTTGTGGTCC AAGGCCCCCG CCAGCTGCTA CATCCTGAAA 1680 

GGAGAAACTG GACCAGTGGT GATACATTTT CCCCTGTTCA ACATAGATGC CTGTGGAGGT 1740 

GATATTGAGG CATGGAGTGA CACATACGAC ACATTCAAGC TTGCTGACAC CTACACTCTA 1800 

GATGTGGTGG TGCTACTCTT GGCATTAGCC AAGAAGAATG TCAGGGAAAA CAAGAAGAAG 1860 

ATCCTTAGAG AGTTGATGAA CGTGGCCGGG CTCTACTACC CGAAGGATAG TGCCCGAAGT 1920 

TGCTGCTTGG CATAGATGAG CCTCAGCTTC CAGGGCACTG TGGGCCTGTT GGTCTACTAG 1980 

GGCCCTGAAG TCCACCTGGC CTTCCTGTTC TTCACTCCCT TCAGCCACAC GCTTCATGGC 2040 

CTTGAGTTCA CCTTGGCTGT CCTAACAGGG CCAATCACCA GTGACCAGCT AGACTGTGAT 2100 

TTTGATAGCG TCATTCAGAA GAAGGTGTCC AAGGAGCTGA AGGTGGTGAA ATTTGTCCTG 2160 

CAGGTCCCTC GGGAGATCCT GGAGCTGGAG CATGAGTGTC TGACAATCAG AAGCATCATG 2220 

TCCAATGTCC AGATGGCCAG AATGAATGTG ATAGTTCAGA CCAATGCCTT CCACTGCTCC 2280 

TTTATGACTG CACTTCTAGC CAGTAGCTCT GCACAAGTTA GCTCTGTAGA AGTAAGAACT 2340 

TGGGCTTAAA TCATGGGCTA TCTCTCCACA GCCAAGTGGA GCTCTGAGAA TACAACAAGT 2400 

GCTCAATAAA TGCTTGCTGA TTGACTGATG AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2460 
AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAA 



Seq ID No: 93 Protein sequence: 
Protein Accession #: NP_003697.1 

1 11 21 31 41 51 

I I I I I I 

MGSSEVSIIP GLQKEEKAAV ERRRLHVLKA LKKLRIEADE APVVAVLGSG GGLRAHIACL 60 
GVLSEMKEQG LLDAVTYLAG VSGSTWAISS LYTNDGDMEA LEADLKHRFT RQEWDLAKSL 120 
QKTIQAARSE NYSLTDFWAY MVISKQTREL PESHLSNMKK PVEEGTLPYP IFAAIDNDLQ 180 
PSWQEARAPE TWFEFTPHHA GFSALGAFVS ITHFGSKFKK GRLVRTHPER DLTFLRGLWG 240 
SALGNTEVIR EYIFDQLRNL TLKGLWRRAV ANAKSIGHLI FARLLRLQES SQGEHPPPED 300 
EGGEPEHTWL TEMLENWTRT SLEKQEQPHE DPERKGSLSN LMDFVKKTGI CASKWEWGTT 360 
HNFLYKHGGI RDKIMSSRKH LHLVDAGLAI NTPFPLVLPP TREVHLILSF DFSAGDPFET 420 
IRATTDYCRR HKIPFPQVEE AELDLWSKAP ASCYILKGET GPVVIHFPLF NIDACGGDIE 480 
AWSDTYDTFK LADTYTLDVV VLLLALAKKN VRENKKKILR ELMNVAGLYY PKDSARSCCL 540 
A 

Seq ID NO: 94 DNA sequence 
Nucleic Acid Accession #: AK027351 

Coding sequence: 1-642 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

AGGGAAAAAA ACTCCATTAA AAAGCCCAGC TTTCCTCCAT GTTAGATGTG ACTTGGAAAA 60 

TGAGAAAGAT TTAGCAAAAT TCCACCGTAT CTTTTGCCAG GCTAGAGACA GGGAGAGCAG 120 

AGTAAAACCC TCAGGCTGCT GAAATTTCTA GGCTGTTAGG AAGCCCCTCG AATTCTGTGA 180 

AAATGAGGGT TTCTTAACTC ACACTGAGAG CGGAAAGGGG CAGACCCTTT TCATAACTCC 240 

CTCAAGTGTG TGTTACCTTT CTTTACCAGC ATGGTAAGCA ACAGGACATA TCCCAGCCTC 300 

GGACATGTCT GTATGATCCA AGGTACCCAA AGTCAGACAG AGTAAACTCA AGCCTGGCAC 360 

TGGCTTTCTG CCGCTTCATG TGCTTTGGAA AAAGCAGGAG AAGCAATAGC AGCAGGAGTC 420 

CCCAGCAGCT GGAGCCGCAA GAATGAACTG CAAAGAGGGA ACTGACAGCA GCTGCGGCTG 480 

CAGGGGCAAC GACGAGAAGA AGATGTTGAA GTGTGTGGTG GTGGGGGACG GTGCCGTGGG 540 

GAAAACCTGC CTGCTGATGA GCTACGCCAA CGACGCCTTC CCAGAGGAAT ACGTGCCCAC 600 

TGTGTTTGAC CACTATGCAG TTACTGTGAC TGTGGGAGGC AAGCAACACT TGCTCGGACT 660 

GTATGACACC GCGGGACAGG AGGACTACAA CCAGCTGAGG CCACTCTCCT ACCCCAACAC 720 

GGATGTGTTT TTGATCTGCT TCTCTGTCGT AAACCCTGCC TCTTACCACA ATGTCCAGGA 780 

GGAATGGGTC CCCGAGCTCA AGGACTGCAT GCCTCACGTG CCTTATGTCC TCATAGGGAC 840 

CCAGATTGAT CTCCGTGATG ACCCAAAAAC CTTGGCCCGT TTGCTGTATA TGAAAGAGAA 900 

ACCTCTCACT TACGAGCATG GTGTGAAGCT CGCAAAAGCG ATCGGAGCAC AGTGCTACTT 960 

GGAATGTTCA GCTCTGACTC AGAAAGGTCT CAAAGCGGTT TTTGATGAAG CAATCCTCAC 1020 

CATTTTCCAC CCCAAGAAAA AGAAGAAACG CTGTTCTGAG GGTCACAGCT GCTGTTCAAT 1080 

TATCTGAGGT TGTCTGGGAC CTGCCTCCAC CCCATCCAGG GATGAGAATG GCAGCCAATC 1140 

TCTGTGGCCA AGCTCCAGCC AAAAAGGAGG GCACGACCAG AAAGGAACTC CCTTTGCACG 1200 

GAGGCTTGCC CCATCACCCT CTGAGCCCTC CCAACACAGC ACACTAGTCA GCCCACTGCC 1260 

ACGACCTCCC TGCCAGCCAG AAGCATCCGT ACTGCACGCT GTCTGAGAAT GCTGGGCCTG 1320 

GATTGCAGAC AGTGCCGCTG CTGATCGCAT CAAAAACAAA GTCAAAGGCC ATCTCACATT 1380 

TTACAAATCC CCAGCTCATG AACGTGAAGC TGATAGGAAA TCACCCCAGG GAACCCGAAA 1440 

AAGAAACTTG ATTCCTCTAT TGCTGGCCTT ACTTGATGTC TTTTATAAAA CTTGGGACTA 1500 

CAATACTAAC CTTTTTTTCT GAATCTGCTG TTCTACCCAT GTGTCTCACA TTCATTTGTA 1560 

TTATTTCAAG AAATGTACTA ATTTCCAGTT CACTCAGGCC TTACTAATCC ATACCAAATT 1620 

AGCCTAAAGA CAAGGCATTT TATATTCATT TCTATTTTCA GCATGTTTCT ACCAAAGCTA 1680 

TTAGAACCAA CACGTACCTC TGAATGCCCG ATTATAAGAA GACATGAGAA GACTTTAAAA 1740 

GTTTTGGAAA TTTACAGAGC CATGATTTTT GAACCTAATT GAAAGAAAAC CATCTGAATT 1800 

GTTGCAGGTC CACATTTTTG CCAAAGATAC ACTCTATAGA TGCTTAGTAG TGGCCTGATT 1860 

TTTTTCCATG TATTGCCACG ACAAACTAAA AATGAACTGT GTTTAAGAAT GTAGTATTTC 1920 

TGTTTTTCAT CCAAGTTGAT TGGGGGAAGA ATATGGCAGG ATCCATCTTT TACAGTATTT 1980 

TGTATTCAGT AAAGTGGACA TTCCTGCTCC TCCCTTCCCC CATTGCATGC 2040 

CCTTGATTTC ACTTTCTCTC ATGCCCGGAT CCTTTTATTC TCCCCAGTTA TAACCCAGTT 2100 

ATAAAAGAAA GATCTGAGCA TAAAGATACG TGTTTAAAAA TAACTAAAAG TAAAGGAAAG 2160 

TGCCTTAATT TTTCTATTTG CTTCAACTGA AAGTGCTTCT CAGCTCGCCC CATGTAAGTT 2220 

CTCATTCCAT GTAAATGACA TTTTCCAGTT ACAACTGGTA CTGAGATTTT GCCTCTCTCT 2280 

TTCCTTACTC ATCCTCCCAA ATGTCTTTGT GGGAGCCATA TCAGTGGATA CCAAGCTCTG 2340 

TATCCATTTG TCCCCTGCCC TCCACAATGT GTGACATAGA ACAGGGACTT TGGCCCTGGG 2400 

AAAGCAAAAG CTCCCAGTAA GGAATCCTGT GCCCAATGAT GTAAAACAAT TCCAAACATC 2460 

CAGGAATTTT TGTATCATAG AGCGAATTAC TTCCTATCTT TTCATTAGAG GCTATGAGGA 2520 

CTTCTAATTA GTCTTAGTTG CTTATAAGTG CCCTGGAATC ACCCAGGTAG GCACTTAATT 2580 

TTTTTTTCAG TTGCATGAGC AAAGTGCTTC TTAGTAGTGT GAAATTACAA CAACTTTAAG 2640 

ACTTTCCAGA TTCAAGCTCC CACTGTTGGA AAAAGCCAGC CTTTCTAATC TCTTCTGCTA 2700 

CTGGAATAAG CACTTAAGAA TTGCGTGATA GCCAGGCACC GTGGCTCATG CCTGTAATCC 2760 

CAACACTTAG GGAGGCTGAG GTGGGTGGGC CGCTTGAGCT CAGGAGTTCA AGACCAGCCT 2820 

GGGTAATATA GTGAGATCCT GTGTCTCTAT AAAAAAATTA AAAATTAGTC AGTTGTAGTG 2880 

ACACATACCT GTAGTCCCAG CTACTCAGGA GGCTGAGGTG GAAGGATCAC TTGAGCCCAG 2940 

AAGGTAAGGC TGCAGTGAGC TGTGACTGTG CCACTACACT CCAGCCTGAG TGACAGAGAA 3000 

AGAACCTGTC AAAAAAAAAA AAAAAACAAC CTACATTTCA AGTACTATTT CCCTTCTCTC 3060 

CCATCTAATT GCTAAAGATT TTCTTTCATA CGCACACACT CCAGTGACTG GAAAAACGGG 3120 

AGTTTTCAGT CAAAGCTTGA CATTTAGAGA AAACAAGGAC TTTCTGCCTT TATAAATGGA 3180 

AATCAACTGT GTATGAACTA TAACTCTGCA GAGGTTATGA ATTCATCCTT TACAAACAAT 3240 

AATGAACTTT TAGTCCTGTA ATAAATGAAA TGTTATTAGG CAGCTTTGTT GCATGATTGC 3300 

ATAGTTATAT CTTGCTAACG GGCCACTCAT TTCTCACTGA TGTGGATGAA AAAATGAGAG 3360 

CAGTATGTTT CCAGGTGTGT GCACTCAACA GGCAAATAGC TCCCGAGGTC ACCACTTCCC 3420 

TAATGGGCCA CAGGAAGTAA GTTGATCTTG ATGGGGAGAT CACGTCACCC AGAACCAGCA 3480 

ACTGGATAGA GACTGTTGTT AGTGTCTGGG TAGAGCACAG GCTCCCAGGG GTCTTAAGAG 3540 
CTAATTACTG AATAAAACAA TCTAGAACAA AGCAA 



Seq ID No: 95 Protein sequence: 
Protein Accession #: CAC066U.1 

1 11 21 31 41 51 

I I I I I I 

MNCKEGTDSS CGCRGNDEKK MLKCVVVGDG AVGKTCLLMS YANDAFPEEY VPTVFDHYAV 60 
TVTVGGKQHL LGLYDTAGQE DYNQLRPLSY PNTDVFLICF SVVNPASYHN VQEEWVPELK 120 
DCMPHVPYVL IGTQIDLRDD PKTLARLLYM KEKPLTYEHG VKLAKAIGAQ CYLECSALTQ 180 
KGLKAVFDEA ILTIFHPKKK KKRCSEGHSC CSII 



Seq ID NO: 96 DNA sequence 
Nucleic Acid Accession #: NM_003654.1 

Coding sequence: 367-1602 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGGGAGGGCG CGGGAGGCGG AGGATGCCGC CGCGGCTGCT GCCGCCGCCG CCACCCGCGG 60 

GTCCCCGGCG ACCCTACTCC AGACCCGAGG ATGGAGCCGG CGCTGGGCGC TGCAGCTGCT 120 

CCCGGCGCGT CCCCGACCAG GTAGCTGGTG TCACTTCGGT GTGGTTGGAA GAAGACTTTC 180 

TCCCCAGCTG CATTCCCGGA GGCGCCCTTT CGACCTGGAG GCCGGGTCTG CTGGCCACAG 240 

GGCTGCCGCA CTGGCTGGGA CTGCCAGCTG GGCCTGGAGA CGCTGGTGGC TGTGGACTCC 300 

CCAGCTTGGA GCAGTCCCTC TTTGACCTCA CCCCTTGGAG AAGCAGCCCC ATGAAGGTGC 360 

CCAGCCATGC AATGTTCCTG GAAGGCCGTC CTCCTCCTTG CCCTGGCCTC CATTGCCATC 420 

CAGTACACGG CCATCCGCAC CTTCACCGCC AAGTCCTTTC ACACCTGCCC CGGGCTGGCA 480 

GAGGCCGGGC TGGCCGAGCG ACTGTGCGAG GAGAGCCCCA CCTTCGCCTA CAACCTCTCC 540 

CGCAAGACCC ACATCCTCAT CCTGGCCACC ACGCGCAGCG GCTCCTCCTT CGTGGGCCAG 600 

CTCTTCAACC AGCACCTGGA CGTCTTCTAC CTGTTTGAGC CCCTCTACCA CGTCCAGAAC 660 

ACGCTCATCC CCCGCTTCAC CCAGGGCAAG AGCCCGGCCG ACCGGCGGGT CATGCTAGGC 720 

GCCAGCCGCG ACCTCCTGCG GAGCCTCTAC GACTGCGACC TCTACTTCCT GGAGAACTAC 780 

ATCAAGCCGC CGCCGGTCAA CCACACCACC GACAGGATCT TCCGCCGCGG GGCCAGCCGG 840 

GTCCTCTGCT CCCGGCCTGT GTGCGACCCT CCGGGGCCAG CCGACCTGGT CCTGGAGGAG 900 

GGGGACTGTG TGCGCAAGTG CGGGCTACTC AACCTGACCG TGGCGGCCGA GGCGTGCCGC 960 

GAGCGCAGCC ACGTGGCCAT CAAGACGGTG CGCGTGCCCG AGGTGAACGA CCTGCGCGCC 1020 

CTGGTGGAAG ACCCGCGATT AAACCTCAAG GTCATCCAGC TGGTCCGAGA CCCCCGCGGC 1080 

ATTCTGGCTT CGCGCAGCGA GACCTTCCGC GACACGTACC GGCTCTGGCG GCTCTGGTAC 1140 

GGCACCGGGA GGAAACCCTA CAACCTGGAC GTGACGCAGC TGACCACGGT GTGCGAGGAC 1200 

TTCTCCAACT CCGTGTCCAC CGGCCTCATG CGGCCCCCGT GGCTCAAGGG CAAGTACATG 1260 

TTGGTGCGCT ACGAGGACCT GGCTCGGAAC CCTATGAAGA AGACCGAGGA GATCTACGGG 1320 

TTCCTGGGCA TCCCGCTGGA CAGCCACGTG GCCCGCTGGA TCCAGAACAA CACGCGGGGC 1380 

GACCCCACCC TGGGCAAGCA CAAATACGGC ACCGTGCGAA ACTCGGCGGC CACGGCCGAG 1440 

AAGTGGCGCT TCCGCCTCTC CTACGACATC GTGGCCTTTG CCCAGAACGC CTGCCAGCAG 1500 

GTGCTGGCCC AGCTGGGCTA CAAGATCGCC GCCTCGGAGG AGGAGCTGAA GAACCCCTCG 1560 

GTCAGCCTGG TGGAGGAGCG GGACTTCCGC CCCTTCTCGT GACCCGGGCG GTGCGGGTGG 1620 

GGGCGGGAGG CGCAAGGTGT CGGTTTTGAT AAAATGGACC GTTTTTAACT GTTGCCTTAT 1680 

TAACCCCTCC CTCTCCCACC TCATCTTCGT GTCCTTCCTG CCCCCAGCTC ACCCCACTCC 1740 

CTTCTGCCCC TTTTTTGTCT CTGAAATTTG CACTACGTCT TGGACGGGAA TCACTGGGGC 1800 

AGAGGGCGCC TGAAGTAGGG TCCCGCCCCC CCCACCCCAT TCAGACACAT GGATGTTGGG 1860 

TCTCTGTGCG GACGGTGACA ATGTTTACAA GCACCACATT TACACATCCA CACACGCACA 1920 

CGGGCACTCG CGAGGCGACT TCTCAAGCTT TTGAATGGGT GAGTGGTCGG GTATCTAGTT 1980 

TTTGCACTGT CTTACTATTC AAGGTAAGAG GATACAAACA AGAGGACCAC TTGTCTCTAA 2040 

TTTATGAATG GTGTCCATCC TTTCCCCATC CCTGCCTCCT GCCCCTGACG CCCATTTCCC 2100 

CCCTTAGAGC AGCGAAACTG CCCCCTCCTG CCCGCCCTTG CCTGTCGGTG AGGCAGGTTT 2160 

TTACTGTGAG GTGAACGTGG ACCTGTTTCT GTTTCCAGTC TGTGGTGATG CTGTCTGTCT 2220 

GTCTGAGTCT CGTGGCCGCC CCTGGACCAG TGATGACTGA TGAATCTTAT GAGCTTCTGA 2280 

TTGATCTCGG GGTCCATCTG TGATATTTCT TTGTGCCAAA AAGAAAAAAA AAGAGTGGAT 2340 

CAGTTTGCTA AATGAACATT GAAATTGAAA TGCTTTATCT GTGTTTTCTG TAAATAAAAG 2400 
AGTGCAATAA TCACC 



Seq ID No: 97 Protein sequence: 
Protein Accession #: NP_003645.1 

1 11 21 31 41 51 

I I I I I I 

MQCSWKAVLL LALASIAIQY TAIRTFTAKS FHTCPGLAEA GLAERLCEES PTFAYNLSRK 60 
THILILATTR SGSSFVGQLF NQHLDVFYLF EPLYHVQNTL IPRFTQGKSP ADRRVMLGAS 120 
RDLLRSLYDC DLYFLENYIK PPPVNHTTDR IFRRGASRVL CSRPVCDPPG PADLVLEEGD 180 
CVRKCGLLNL TVAAEACRER SHVAIKTVRV PEVNDLRALV EDPRLNLKVI QLVRDPRGIL 240 
ASRSETFRDT YRLWRLWYGT GRKPYNLDVT QLTTVCEDFS NSVSTGLMRP PWLKGKYMLV 300 
RYEDLARNPM KKTEEIYGFL GIPLDSHVAR WIQNNTRGDP TLGKHKYGTV RNSAATAEKW 360 
RFRLSYDIVA FAQNACQQVL AQLGYKIAAS EEELKNPSVS LVEERDFRPF S 



Seq ID NO: 98 DNA sequence 

Nucleic Acid Accession #: NM_002852.1 

Coding sequence: 68-1213 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CTCAAACTCA GCTCACTTGA GAGTCTCCTC CCGCCAGCTG TGGAAAGAAC TTTGCGTCTC 60 

TCCAGCAATG CATCTCCTTG CGATTCTGTT TTGTGCTCTC TGGTCTGCAG TGTTGGCCGA 120 

GAACTCGGAT GATTATGATC TCATGTATGT GAATTTGGAC AACGAAATAG ACAATGGACT 180 

CCATCCCACT GAGGACCCCA CGCCGTGCGA CTGCGGTCAG GAGCACTCGG AATGGGACAA 240 

GCTCTTCATC ATGCTGGAGA ACTCGCAGAT GAGAGAGCGC ATGCTGCTGC AAGCCACGGA 300 

CGACGTCCTG CGGGGCGAGC TGCAGAGGCT GCGGGAGGAG CTGGGCCGGC TCGCGGAAAG 360 

CCTGGCGAGG CCGTGCGCGC CGGGGGCTCC CGCAGAGGCC AGGCTGACCA GTGCTCTGGA 420 

CGAGCTGCTG CAGGCGACCC GCGACGCGGG CCGCAGGCTG GCGCGTATGG AGGGCGCGGA 480 

GGCGCAGCGC CCAGAGGAGG CGGGGCGCGC CCTGGCCGCG GTGCTAGAGG AGCTGCGGCA 540 

GACGCGAGCC GACCTGCACG CGGTGCAGGG CTGGGCTGCC CGGAGCTGGC TGCCGGCAGG 600 

TTGTGAAACA GCTATTTTAT TCCCAATGCG TTCCAAGAAG ATTTTTGGAA GCGTGCATCC 660 

AGTGAGACCA ATGAGGCTTG AGTCTTTTAG TGCCTGCATT TGGGTCAAAG CCACAGATGT 720 

ATTAAACAAA ACCATCCTGT TTTCCTATGG CACAAAGAGG AATCCATATG AAATCCAGCT 780 

GTATCTCAGC TACCAATCCA TAGTGTTTGT GGTGGGTGGA GAGGAGAACA AACTGGTTGC 840 

TGAAGCCATG GTTTCCCTGG GAAGGTGGAC CCACCTGTGC GGCACCTGGA ATTCAGAGGA 900 

AGGGCTCACA TCCTTGTGGG TAAATGGTGA ACTGGCGGCT ACCACTGTTG AGATGGCCAC 960 

AGGTCACATT GTTCCTGAGG GAGGAATCCT GCAGATTGGC CAAGAAAAGA ATGGCTGCTG 1020 

TGTGGGTGGT GGCTTTGATG AAACATTAGC CTTCTCTGGG AGACTCACAG GCTTCAATAT 1080 

CTGGGATAGT GTTCTTAGCA ATGAAGAGAT AAGAGAGACC GGAGGAGCAG AGTCTTGTCA 1140 

CATCCGGGGG AATATTGTTG GGTGGGGAGT CACAGAGATC CAGCCACATG GAGGAGCTCA 1200 

GTATGTTTCA TAAATGTTGT GAAACTCCAC TTGAAGCCAA AGAAAGAAAC TCACACTTAA 1260 

AACACATSCC AGTTGGGAAG GTCTGAAAAC TCAGTGCATA ATAGGAACAC TTGAGACTAA 1320 

TGAAAGAGAG AGTTGAGACC AATCTTTATT TGTACTGGCC AAATACTGAA TAAACAGTTG 1380 

AAGGAAAGAC ATTGGAAAAA GCTTTTGAGG ATAATGTTAC TAGACTTTAT GCCATGGTGC 1440 

TTTCAGTTTA ATGCTGTGTC TCTGTCAGAT AAACTCTCAA ATAATTAAAA AGGACTGTAT 1500 

TGTTGAACAG AGGGACAATT GTTTTACTTT TCTTTGGTTA ATTTTGTTTT GGCCAGAGAT 1560 

GAATTTTACA TTGGAAGAAT AACAAAATAA GATTTGTTGT CCATTGTTCA TTGTTATTGG 1620 

TATGTACCTT ATTACAAAAA AAATGATGAA AACATATTTA TACTACAAGG TGACTTAACA 1680 

ACTATAAATG TAGTTTATGT GTTATAATCG AATGTCACGT TTTTGAGAAG ATAGTCATAT 1740 

AAGTTATATT GCAAAAGGGA TTTGTATTAA TTTAAGACTA TTTTTGTAAA GCTCTACTGT 1800 
AAATAAAATA TTTTATAAAA CTAAAAAAAA AAAAAAA 



Seq ID No: 99 Protein sequence: 
Protein Accession #: NP_002843.1 

1 11 21 31 41 51 

I I I I I I 

MHLLAILFCA LWSAVLAENS DDYDLMYVNL DNEIDNGLHP TEDPTPCDCG QEHSEWDKLF 60 
IMLENSQMRE RMLLQATDDV LRGELQRLRE ELGRLAESLA RPCAPGAPAE ARLTSALDEL 120 
LQATRDAGRR LARMEGAEAQ RPEEAGRALA AVLEELRQTR ADLHAVQGWA ARSWLPAGCE 180 
TAILFPMRSK KIFGSVHPVR PMRLESFSAC IWVKATDVLN KTILFSYGTK RNPYEIQLYL 240 
SYQSIVFVVG GEENKLVAEA MVSLGRWTHL CGTWNSEEGL TSLWVNGELA ATTVEMATGH 300 
IVPEGGILQI GQEKNGCCVG GGFDETLAFS GRLTGFNIWD SVLSNEEIRE TGGAESCHIR 360 
GNIVGWGVTE IQPHGGAQYV S 



Seq ID NO: 100 DNA sequence 

Nucleic Acid Accession #: NM_007351.1 

Coding sequence: 72-3758 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

CTGCTATCAA AAAGGCCATA AGGATTTTGT CCCCAAATTT CACATGAGCT ACCTTGCTTC 60 

AAACTACTGA GATGAAGGGG GCAAGATTAT TTGTCCTTCT TTCTAGTTTA TGGAGTGGGG 120 

GCATTGGGCT TAACAACAGT AAGCATTCTT GGACTATACC TGAGGATGGG AACTCTCAGA 180 

AGACTATGCC TTCTGCTTCA GTTCCTCCAA ATAAAATACA AAGTTTGCAA ATACTGCCAA 240 

CCACTCGGGT CATGTCGGCG GAGATAGCTA CAACTCCAGA GGCAAGAACT TCTGAAGACA 300 

GTCTTCTTAA ATCAACACTG CCTCCCTCAG AAACAAGTGC ACCTGCTGAG GGTGTGAGAA 360 

ATCAAACTCT CACATCCACA GAGAAAGCAG AAGGAGTGGT CAAGTTACAG AATCTTACCC 420 

TCCCAACCAA CGCTAGCATC AAGTTCAATC CTGGAGCAGA ATCAGTGGTC CTTTCCAATT 480 

CTACACTGAA ATTTCTTCAG AGCTTTGCCA GAAAGTCAAA TGAACAAGCA ACTTCTCTAA 540 

ACACAGTTGG AGGCACTGGA GGCATTGGAG GCGTTGGAGG CACTGGAGGC GTGGGAAATC 600 

GAGCCCCACG GGAAACATAC CTCAGCCGGG GTGACAGCAG TTCCAGCCAA AGAACTGACT 660 

ACCAAAAATC AAATTTCGAA ACAACTAGAG GAAAGAATTG GTGTGCTTAT GTACATACCA 720 

GGTTATCTCC CACAGTGACA TTGGACAACC AGGTCACTTA TGTCCCAGGT GGGAAAGGAC 780 

CTTGTGGCTG GACCGGTGGA TCCTGTCCTC AGAGATCTCA GAAGATATCC AATCCTGTCT 840 

ATAGGATGCA ACATAAAATT GTCACCTCAT TGGATTGGAG GTGCTGTCCT GGATACAGTG 900 

GGCCGAAATG TCAACTAAGA GCCCAGGAAC AGCAAAGTTT GATACACACC AACCAGGCTG 960 

AAAGTCATAC AGCTGTTGGC AGAGGAGTAG CTGAGCAGCA GCAGCAGCAA GGCTGTGGTG 1020 

ACCCAGAAGT GATGCAAAAA ATGACTGATC AGGTGAACTA CCAGGCAATG AAACTGACTC 1080 

TTCTGCAGAA GAAGATTGAC AATATTTCTT TGACTGTGAA TGATGTAAGG AACACTTACT 1140 

CCTCCCTAGA AGGAAAAGTC AGCGAAGATA AAAGCAGAGA ATTTCAATCT CTTCTAAAAG 1200 

GTCTAAAATC CAAAAGCATT AATGTACTGA TAAGAGACAT AGTAAGAGAA CAATTTAAAA 1260 

TTTTTCAAAA TGACATGCAA GAGACTGTAG CACAGCTCTT CAAGACTGTA TCAAGTCTAT 1320 

CAGAGGACCT CGAAAGCACC AGGCAAATAA TTCAAAAAGT TAATGAATCT GTGGTTTCAA 1380 

TAGCAGCCCA GCAAAAGTTT GTTTTGGTGC AAGAGAATCG GCCCACTTTG ACTGATATAG 1440 

TGGAACTAAG GAATCACATT GTGAATGTAA GGCAAGAAAT GACTCTTACA TGTGAGAAGC 1500 

CTATTAAAGA ACTAGAAGTA AAGCAGACTC ATTTAGAAGG TGCTCTAGAA CAGGAACACT 1560 

CAAGAAGCAT TCTGTATTAT GAATCCCTCA ATAAAACTCT TTCTAAATTG AAGGAAGTAC 1620 

ATGAGCAGCT TTTATCAACT GAACAGGTAT CAGACCAGAA GAATGCTCCA GCTGCTGAGT 1680 

CAGTTAGCAA TAATGTCACT GAGTACATGT CTACTTTACA TGAAAATATA AAGAAGCAGA 1740 

GTTTGATGAT GCTGCAAATG TTTGAAGATT TGCACATTCA AGAAAGCAAG ATTAACAATC 1800 

TCACCGTCTC TTTGGAGATG GAGAAAGAGT CTCTCAGAGG TGAATGTGAA GACATGTTAT 1860 

CCAAATGCAG AAATGATTTT AAATTTCAAC TTAAGGACAC AGAAGAGAAT TTACATGTGT 1920 

TAAATCAAAC ATTGGCTGAA GTTCTCTTTC CAATGGACAA TAAGATGGAC AAAATGAGTG 1980 

AGCAACTAAA TGATTTGACT TATGATATGG AGATCCTTCA ACCCTTGCTT GAGCAGGGAG 2040 

CATCACTCAG ACAGACAATG ACATATGAAC AACCAAAGGA AGCAATAGTG ATAAGGAAAA 2100 

AGATAGAAAA TCTGACTAGT GCTGTCAATA GTCTAAATTT TATTATCAAA GAACTTACAA 2160 

AAAGACACAA CTTACTTAGA AATGAAGTAC AGGGTCGTGA TGATGCCTTA GAAAGACGTA 2220 

TCAATGAATA TGCCTTAGAA ATGGAAGATG GCCTCAATAA GACAATGACT ATTATAAATA 2280 

ATGCTATTGA TTTCATTCAA GATAACTATG CCCTAAAAGA GACTTTAAGT ACTATTAAGG 2340 

ATAATAGTGA GATCCATCAT AAATGTACCT CCGATATGGA AACTATTTTG ACATTTATTC 2400 

CTCAGTTCCA CCGTCTGAAT GATTCTATTC AGACTTTGGT CAATGACAAT CAGAGATATA 2460 

ACTTTGTTTT GCAAGTCGCC AAGACCCTTG CAGGTATTCC CAGAGATGAG AAACTAAATC 2520 

AGTCCAACTT CCAAAAGATG TATCAAATGT TCAATGAAAC CACTTCCCAA GTGAGAAAAT 2580 

ACCAGCAAAA TATGAGTCAT TTGGAAGAAA AACTACTCTT AACTACCAAG ATTTCCAAAA 2640 

ATTTTGAGAC TCGGTTGCAA GACATTGAGT CTAAAGTTAC CCAGACGCTC ATACCTTATT 2700 

ATATTTCAGT TAAAAAAGGC AGTGTAGTTA CAAATGAGAG AGATCAGGCT CTTCAACTGC 2760 

AAGTATTAAA TTCCAGATTT AAGGCGTTGG AAGCAAAATC TATCCATCTT TCAATTAACT 2820 

TCTTTTCGCT TAACAAAACT CTCCACGAAG TTTTAACAAT GTGTCACAAT GCTTCTACAA 2880 

GTGTGTCAGA ACTGAATGCT ACCATCCCTA AGTGGATAAA ACATTCCCTG CCAGATATTC 2940 

AACTTCTTCA GAAAGGTCTA ACAGAATTTG TGGAACCAAT AATTCAAATA AAAACTCAAG 3000 

CTGCCCTATC TAATTCAACT TGTTGTATAG ATCGATCGTT GCCTGGTAGT CTGGCAAATG 3060 

TTGTCAAGTC TCAGAAGCAA GTAAAATCAT TGCCAAAGAA AATTAACGCA CTTAAGAAAC 3120 

CAACGGTAAA TCTTACCACA GTCCTGATAG GCCGGACTCA AAGAAACACG GACAACATAA 3180 

TATATCCTGA GGAGTATTCA AGCTGTAGTC GGCATCCGTG CCAAAATGGG GGCACGTGCA 3240 

TAAATGGAAG AACTAGCTTT ACCTGTGCCT GCAGACATCC TTTTACTGGT GACAACTGCA 3300 

CTATCAAGCT TGTGGAAGAA AATGCTTTAG CTCCAGATTT TTCCAAAGGA TCTTACAGAT 3360 

ATGCACCCAT GGTGGCATTT TTTGCATCTC ATACGTATGG AATGACTATA CCTGGTCCTA 3420 

TCCTGTTTAA TAACTTGGAT GTCAATTATG GAGCTTCATA TACCCCAAGA ACTGGAAAAT 3480 

TTAGAATTCC GTATCTTGGA GTATATGTTT TCAAGTACAC CATCGAGTCA TTTAGTGCTC 3540 

ATATTTCTGG ATTTTTAGTG GTTGATGGAA TAGACAAGCT TGCATTTGAG TCTGAAAATA 3600 

TTAACAGTGA AATACACTGT GATAGGGTTT TAACTGGGGA TGCCTTATTA GAATTAAATT 3660 

ATGGGCAGGA AGTCTGGTTA CGACTTGCAA AAGGAACAAT TCCAGCCAAG TTTCCCCCTG 3720 

TTACTACATT TAGTGGCTAT TTATTATATC GTACATAAGT TAGTATGAAA AACAGACTAT 3780 

CACCTTTATT GAGAAACAGC CAGTGTTTTC ATTTATCTTT GCTTGCACAT CTGCTCTGTT 3840 

TTGGTTTTTC TACAGGAAAT GAAAATCAAC TTGTTTTTTT AATATGAGTA AACTTGTATG 3900 

TCTATTTTAT AAAATTATTT GAATATTGTT TAATGTCTGA ATATGAAAGA GTTCTTGATC 3960 

CTAAAGAAAT TTAGTGGCAC AGAAAACAAA GTGAATTTGT TAGCATAATT ATTCCTATTC 4020 

TTATTTCTTC ATTTTAAGTC ATTGCAATGG AAAGTAATAT TATAAAACGG TAATTACAAC 4080 

ATATTATCAG TCACAGTTTT CTTTCCAATT AAACACTTAA CTTTTGTTAT TCCCTGTATA 4140 
TAAATATATA ACACACATTT TCTAGATTCA CAAATTTAAA TAAATTACTC AAAAAATG 



Seq ID No: 101 Protein sequence: 
Protein Accession #: NP_031377.1 

1 11 21 31 41 51 

I I I I I I 

MKGARLFVLL SSLWSGGIGL NNSKHSWTIP EDGNSQKTMP SASVPPNKIQ SLQILPTTRV 60 
MSAEIATTPE ARTSEDSLLK STLPPSETSA PAEGVRNQTL TSTEKAEGVV KLQNLTLPTN 120 
ASIKFNPGAE SVVLSNSTLK FLQSFARKSN EQATSLNTVG GTGGIGGVGG TGGVGNRAPR 180 
ETYLSRGDSS SSQRTDYQKS NFETTRGKNW CAYVHTRLSP TVTLDNQVTY VPGGKGPCGW 240 
TGGSCPQRSQ KISNPVYRMQ HKIVTSLDWR CCPGYSGPKC QLRAQEQQSL IHTNQAESHT 300 
AVGRGVAEQQ QQQGCGDPEV MQKMTDQVNY QAMKLTLLQK KIDNISLTVN DVRNTYSSLE 360 
GKVSEDKSRE FQSLLKGLKS KSINVLIRDI VREQFKIFQN DMQETVAQLF KTVSSLSEDL 420 
ESTRQIIQKV NESVVSIAAQ QKFVLVQENR PTLTDIVELR NHIVNVRQEM TLTCEKPIKE 480 
LEVKQTHLEG ALEQEHSRSI LYYESLNKTL SKLKEVHEQL LSTEQVSDQK NAPAAESVSN 540 
NVTEYMSTLH ENIKKQSLMM LQMFEDLHIQ ESKINNLTVS LEMEKESLRG ECEDMLSKCR 600 
NDFKFQLKDT EENLHVLNQT LAEVLFPMDN KMDKMSEQLN DLTYDMEILQ PLLEQGASLR 660 
QTMTYEQPKE AIVIRKKIEN LTSAVNSLNF IIKELTKRHN LLRNEVQGRD DALERRINEY 720 
ALEMEDGLNK TMTIINNAID FIQDNYALKE TLSTIKDNSE IHHKCTSDME TILTFIPQFH 780 
RLNDSIQTLV NDNQRYNFVL QVAKTLAGIP RDEKLNQSNF QKMYQMFNET TSQVRKYQQN 840 
MSHLEEKLLL TTKISKNFET RLQDIESKVT QTLIPYYISV KKGSVVTNER DQALQLQVLN 900 
SRFKALEAKS IHLSINFFSL NKTLHEVLTM CHNASTSVSE LNATIPKWIK HSLPDIQLLQ 960 
KGLTEFVEPI IQIKTQAALS NSTCCIDRSL PGSLANVVKS QKQVKSLPKK INALKKPTVN 1020 
LTTVLIGRTQ RNTDNIIYPE EYSSCSRHPC QNGGTCINGR TSFTCACRHP FTGDNCTIKL 1080 
VEENALAPDF SKGSYRYAPM VAFFASHTYG MTIPGPILFN NLDVNYGASY TPRTGKFRIP 1140 
YLGVYVFKYT IESFSAHISG FLVVDGIDKL AFESENINSE IHCDRVLTGD ALLELNYGQE 1200 
VWLRLAKGTI PAKFPPVTTF SGYLLYRT 



Seq ID NO: 102 DNA sequence 

Nucleic Acid Accession #: NM_000873.2 

Coding sequence: 57-884 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

ATCTCCCTCC AGGCAGCCCT TGGCTGGTCC CTGCGAGCCC GTGGAGACTG CCAGAGATGT 60

CCTCTTTCGG TTACAGGACC CTGACTGTGG CCCTCTTCAC CCTGATCTGC TGTCCAGGAT 120

CGGATGAGAA GGTATTCGAG GTACACGTGA GGCCAAAGAA GCTGGCGGTT GAGCCCAAAG 180

GGTCCCTCGA GGTCAACTGC AGCACCACCT GTAACCAGCC TGAAGTGGGT GGTCTGGAGA 240

CCTCTCTAAA TAAGATTCTG CTGGACGAAC AGGCTCAGTG GAAACATTAC TTGGTCTCAA 300

ACATCTCCCA TGACACGGTC CTCCAATGCC ACTTCACCTG CTCCGGGAAG CAGGAGTCAA 360

TGAATTCCAA CGTCAGCGTG TACCAGCCTC CAAGGCAGGT CATCCTGACA CTGCAACCCA 420

CTTTGGTGGC TGTGGGCAAG TCCTTCACCA TTGAGTGCAG GGTGCCCACC GTGGAGCCCC 480

TGGACAGCCT CACCCTCTTC CTGTTCCGTG GCAATGAGAC TCTGCACTAT GAGACCTTCG 540

GGAAGGCAGC CCCTGCTCCG CAGGAGGCCA CAGCCACATT CAACAGCACG GCTGACAGAG 600

AGGATGGCCA CCGCAACTTC TCCTGCCTGG CTGTGCTGGA CTTGATGTCT CGCGGTGGCA 660

ACATCTTTCA CAAACACTCA GCCCCGAAGA TGTTGGAGAT CTATGAGCCT GTGTCGGACA 720

GCCAGATGGT CATCATAGTC ACGGTGGTGT CGGTGTTGCT GTCCCTGTTC GTGACATCTG 780

TCCTGCTCTG CTTCATCTTC GGCCAGCACT TGCGCCAGCA GCGGATGGGC ACCTACGGGG 840 

TGCGAGCGGC TTGGAGGAGG CTGCCCCAGG CCTTCCGGCC ATAGCAACCA TGAGTGGCAT 900 

GGCCACCACC ACGGTGGTCA CTGGAACTCA GTGTGACTCC TCAGGGTTGA GGTCCAGCCC 960 

TGGCTGAAGG ACTGTGACAG GCAGCAGAGA CTTGGGACAT TGCCTTTTCT AGCCCGAATA 1020 
CAAACACCTG GACTT 

Seq ID No: 103 Protein sequence: 
Protein Accession #: NP_000864.1 

1 11 21 31 41 51 

I I I I I I 

MSSFGYRTLT VALFTLICCP GSDEKVFEVH VRPKKLAVEP KGSLEVNCST TCNQPEVGGL 60
ETSLNKILLD EQAQWKHYLV SNISHDTVLQ CHFTCSGKQE SMNSNVSVYQ PPRQVILTLQ 120
PTLVAVGKSF TIECRVPTVE PLDSLTLFLF RGNETLHYET FGKAAPAPQE ATATFNSTAD 180
REDGHRNFSC LAVLDLMSRG GNIFHKHSAP KMLEIYEPVS DSQMVIIVTV VSVLLSLFVT 240
SVLLCFIFGQ HLRQQRMGTY GVRAAWRRLP QAFRP

Seq ID NO: 104 DNA sequence 

Nucleic Acid Accession #: NM_001795.2 

Coding sequence: 121-2475 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

GACGGTCGGC TGACAGGCTC CACAGAGCTC CACTCACGCT CAGGCCCTGG ACGGACAGGC 60 

AGTCCAACGG AACAGAAACA TCCCTCAGCC CCACAGGCAC GATCTGTTCC TCCTGGGAAG 120 

ATGCAGAGGC TCATGATGCT CCTCGCCACA TCGGGCGCCT GCCTGGGCCT GCTGGCAGTG 180 

GCAGCAGTGG CAGCAGCAGG TGCTAACCCT GCCCAACGGG ACACCCACAG CCTGCTGCCC 240 

ACCCACCGGC GCCAAAAGAG AGATTGGATT TGGAACCAGA TGCACATTGA TGAAGAGAAA 300 

AACACCTCAC TTCCCCATCA TGTAGGCAAG ATCAAGTCAA GCGTGAGTCG CAAGAATGCC 360 

AAGTACCTGC TCAAAGGAGA ATATGTGGGC AAGGTCTTCC GGGTCGATGC AGAGACAGGA 420 

GACGTGTTCG CCATTGAGAG GCTGGACCGG GAGAATATCT CAGAGTACCA CCTCACTGCT 480 

GTCATTGTGG ACAAGGACAC TGGTGAAAAC CTGGAGACTC CTTCCAGCTT CACCATCAAA 540 

GTTCATGACG TGAACGACAA CTGGCCTGTG TTCACGCATC GGTTGTTCAA TGCGTCCGTG 600

CCTGAGTCGT CGGCTGTGGG GACCTCAGTC ATCTCTGTGA CAGCAGTGGA TGCAGACGAC 660 

CCCACTGTGG GAGACCACGC CTCTGTCATG TACCAAATCC TGAAGGGGAA AGAGTATTTT 720 

GCCATCGATA ATTCTGGACG TATTATCACA ATAACGAAAA GCTTGGACCG AGAGAAGCAG 780 

GCCAGGTATG AGATCGTGGT GGAAGCGCGA GATGCCCAGG GCCTCCGGGG GGACTCGGGC 840 

ACGGCCACCG TGCTGGTCAC TCTGCAAGAC ATCAATGACA ACTTCCCCTT CTTCACCCAG 900 

ACCAAGTACA CATTTGTCGT GCCTGAAGAC ACCCGTGTGG GCACCTCTGT GGGCTCTCTG 960 

TTTGTTGAGG ACCCAGATGA GCCCCAGAAC CGGATGACCA AGTACAGCAT CTTGCGGGGC 1020 

GACTACCAGG ACGCITTCAC CATTGAGACA AACCCCGCCC ACAACGAGGG CATCATCAAG 1080 

CCCATGAAGC CTCTGGATTA TGAATACATC CAGCAATACA GCTTCATCGT CGAGGCCACA 1140 

GACCCCACCA TCGACCTCCG ATACATGAGC CCTCCCGCGG GAAACAGAGC CCAGGTCATT 1200 

ATCAACATCA CAGATGTGGA CGAGCCCCCC ATTTTCCAGC AGCCTTTCTA CCACTTCCAG 1260 

CTGAAGGAAA ACCAGAAGAA GCCTCTGATT GGCACAGTGC TGGCCATGGA CCCTGATGCG 1320 

GCTAGGCATA GCATTGGATA CTCCATCCGC AGGACCAGTG ACAAGGGCCA GTTCTTCCGA 1380 

GTCACAAAAA AGGGGGACAT TTACAATGAG AAAGAACTGG ACAGAGAAGT CTACCCCTGG 1440 

TATAACCTGA CTGTGGAGGC CAAAGAACTG GATTCCACTG GAACCCCCAC AGGAAAAGAA 1500 

TCCATTGTGC AAGTCCACAT TGAAGTTTTG GATGAGAATG ACAATGCCCC GGAGTTTGCC 1560 

AAGCCCTACC AGCCCAAAGT GTGTGAGAAC GCTGTCCATG GCCAGCTGGT CCTGCAGATC 1620 

TCCGCAATAG ACAAGGACAT AACACCACGA AACGTGAAGT TCAAATTCAC CTTGAATACT 1680 

GAGAACAACT TTACCCTCAC GGATAATCAC GATAACACGG CCAACATCAC AGTCAAGTAT 1740 

GGGCAGTTTG ACCGGGAGCA TACCAAGGTC CACTTCCTAC CCGTGGTCAT CTCAGACAAT 1800 

GGGATGCCAA GTCGCACGGG CACCAGCACG CTGACCGTGG CCGTGTGCAA GTGCAACGAG 1860

CAGGGCGAGT TCACCTTCTG CGAGGATATG GCCGCCCAGG TGGGCGTGAG CATCCAGGCA 1920 

GTGGTAGCCA TCTTACTCTG CATCCTCACC ATCACAGTGA TCACCCTGCT CATCTTCCTG 1980 

CGGCGGCGGC TCCGGAAGCA GGCCCGCGCG CACGGCAAGA GCGTGCCGGA GATCCACGAG 2040 

CAGCTGGTCA CCTACGACGA GGAGGGCGGC GGCGAGATGG ACACCACCAG CTACGATGTG 2100 

TCGGTGCTCA ACTCGGTGCG CCGCGGCGGG GCCAAGCCCC CGCGGCCCGC GCTGGACGCC 2160 

CGGCCTTCCC TCTATGCGCA GGTGCAGAAG CCACCGAGGC ACGCGCCTGG GGCACACGGA 2220 

GGGCCCGGGG AGATGGCAGC CATGATCGAG GTGAAGAAGG ACGAGGCGGA CCACGACGGC 2280 

GACGGCCCCC CCTACGACAC GCTGCACATC TACGGCTACG AGGGCTCCGA GTCCATAGCC 2340 

GAGTCCCTCA GCTCCCTGGG CACCGACTCA TCCGACTCTG ACGTGGATTA CGACTTCCTT 2400 

AACGACTGGG GACCCAGGTT TAAGATGCTG GCTGAGCTGT ACGGCTCGGA CCCCCGGGAG 2460 

GAGCTGCTGT ATTAGGCGGC CGAGGTCACT CTGGGCCTGG GGACCCAAAC CCCCTGCAGC 2520 

CCAGGCCAGT CAGACGCCAG GCACCACAGC CTCCAAAAAT GGCAGTGACT CCCCAGCCCA 2580 

GCACCCCTTC CTCGTGGGTC CCAGAGACCT CATCAGCCTT GGGATAGCAA ACTCCAGGTT 2640 

CCTGAAATAT CCAGGAATAT ATGTCAGTGA TGACTATTCT CAAATGCTGG CAAATCCAGG 2700 

CTGGTGTTCT GTCTGGGCTC AGACATCCAC ATAACCCTGT CACCCACAGA CCGCCGTCTA 2760

ACTCAAAGAC TTCCTCTGGC TCCCCAAGGC TGCAAAGCAA AACAGACTGT GTTTAACTGC 2820 

TGCAGGGTCT TTTTCTAGGG TCCCTGAACG CCCTGGTAAG GCTGGTGAGG TCCTGGTGCC 2880 

TATCTGCCTG GAGGCAAAGG CCTGGACAGC TTGACTTGTG GGGCAGGATT CTCTGCAGCC 2940 

CATTCCCAAG GGAGACTGAC CATCATGCCC TCTCTCGGGA GCCCTAGCCC TGCTCCAACT 3000 
CCATACTCCA CTCCAAGTGC CCCACCACTC CCCAACCCCT CTCCAGGCCT GTCAAGAGGG 3060

AGGAAGGGGC CCCATGGCAG CTCCTGACCT TGGGTCCTGA AGTGACCTCA CTGGCCTGCC 3120

ATGCCAGTAA CTGTGCTGTA CTGAGCACTG AACCACATTC AGGGAAATGG CTTATTAAAC 3180

TTTGAAGCAA CTGTGAATTC ATTCTGGAGG GGCAGTGGAG ATCAGGAGTG ACAGATCACA 3240

GGGTGAGGGC CACCTCCACA CCCACCCCCT CTGGAGAAGG CCTGGAAGAG CTGAGACCTT 3300

GCTTTGAGAC TCCTCAGCAC CCCTCCAGTT TTGCCTGAGA AGGGGCAGAT GTTCCCGGAG 3360

CAGAAGACGT CTCCCCTTCT CTGCCTCACC TGGTCGCCAA TCCATGCTCT CTTTCTTTTC 3420

TCTGTCTACT CCTTATCCCT TGGTTTAGAG GAACCCAAGA TGTGGCCTTT AGCAAAACTG 3480

GACAATGTCC AAACCCACTC ATGACTGCAT GACGGAGCCG AGCCATGTGT CTTTACACCT 3540

CGCTGTTGTC ACATCTCAGG GAACTGACCC TCAGGCACAC CTTGCAGAAG GCAAGGCCCT 3600

GCCCTGCCCA ACCTCTGTGG TCACCCATGC ATCTTCCACT GGAACGTTTC ACTGCAAACA 3660

CACCTTGGAG AAGTGGCATC AGTCAACAGA GAGGGGCAGG GAAGGAGACA CCAAGCTCAC 3720

CCTTCGTCAT GGACCGAGGT TCCCACTCTG GGCAAAGCCC CTCACACTGC AAGGGATTGT 3780

AGATAACACT GACTTGTTTG TTTTAACCAA TAACTAGCTT CTTATAATGA TTTTTTTACT 3840

AATGATACTT ACAAGTTTCT AGCTCTCACA GACATATAGA ATAAGGGTTT TTGCATAATA 3900

AGCAGGTTGT TATTTAGGTT AACAATATTA ATTCAGGTTT TTTAGTTGGA AAAACAATTC 3960

CTGTAACCTT CTATTTTCTA TAATTGTAGT AATTGCTCTA CAGATAATGT CTATATATTG 4020

GCCAAACTGG TGCATGACAA GTACTGTATT TTTTTATACC TAAATAAAGA AAAATCTTTA 4080
GCCTGGGCAA CAAAAAAA



Seq ID No: 105 Protein sequence: 
Protein Accession #: NP_001786.1 



1 11 21 31 41 51 

I I I I I I

MQRLMMLLAT SGACLGLLAV AAVAAAGANP AQRDTHSLLP THRRQKRDWI WNQMHIDEEK 60

NTSLPHHVGK IKSSVSRKNA KYLLKGEYVG KVFRVDAETG DVFAIERLDR ENISEYHLTA 120

VIVDKDTGEN LETPSSFTIK VHDVNDNWPV FTHRLFNASV PESSAVGTSV ISVTAVDADD 180

PTVGDHASVM YQILKGKEYF AIDNSGRIIT ITKSLDREKQ ARYEIVVEAR DAQGLRGDSG 240

TATVLVTLQD INDNFPFFTQ TKYTFVVPED TRVGTSVGSL FVEDPDEPQN RMTKYSILRG 300

DYQDAFTIET NPAHNEGIIK PMKPLDYEYI QQYSFIVEAT DPTIDLRYMS PPAGNRAQVI 360

INITDVDEPP IFQQPFYHFQ LKENQKKPLI GTVLAMDPDA ARHSIGYSIR RTSDKGQFFR 420

VTKKGDIYNE KELDREVYPW YNLTVEAKEL DSTGTPTGKE SIVQVHIEVL DENDNAPEFA 480

KPYQPKVCEN AVHGQLVLQI SAIDKDITPR NVKFKFTLNT ENNFTLTDNH DNTANITVKY 540

GQFDREHTKV HFLPVVISDN GMPSRTGTST LTVAVCKCNE QGEFTFCEDM AAQVGVSIQA 600

VVAILLCILT ITVITLLIFL RRRLRKQARA HGKSVPEIHE QLVTYDEEGG GEMDTTSYDV 660

SVLNSVRRGG AKPPRPALDA RPSLYAQVQK PPRHAPGAHG GPGEMAAMIE VKKDEADHDG 720 

DGPPYDTLHI YGYEGSESIA ESLSSLGTDS SDSDVDYDFL NDWGPRFKML AELYGSDPRE 780 
ELLY 



Seq ID NO: 106 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 1-474 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

ACAGTACTCT GTGCAAAAAA CCTGGTGAAA AAGGATTTTT TCCGACTTCC TGATCCATTT 60

GCTAAGGTGG TGGTTGATGG ATCTGGGCAA TGCCATTCTA CAGATACTGT GAAGAATACG 120

CTTGATCCAA AGTGGAATCA GCATTATGAC CTGTATATTG GAAAGTCTGA TTCAGTTACG 180

ATCAGTGTAT GGAATCACAA GAAGATCCAT AAGAAACAAG GTGCTGGATT TCTCGGTTGT 240

GTTCGTCTTC TTTCCAATGC CATCAACCGC CTCAAAGACA CTGGTTATCA GAGGTTGGAT 300

TTATGCAAAC TCGGGCCAAA TGACAATGAT ACAGTTAGAG GACAGATAGT AGTAAGTCTT 360

CAGTCCAGAG ACCGAATAGG CACAGGAGGA CAAGTTGTGG ACTGCAGTCG TTTATTTGAT 420

AACGATTTAC CAGACGGAGC TCATTATTTG TGGACTTGGA AAGATAGATG TTAATGACTG 480

GAAGGTAAAC ACCCGGTTAA AACACTGTAC ACCAGACAGC AACATTGTCA AATGGTTCTG 540

GAAAGCTGTG GAGTTTTTTG ATGAAGAGCG ACGAGCAAGA TTGCTTCAGT TTGTGACAGG 600

ATCCTCTCGA GTGCCTCTGC AGGGCTTCAA AGCATTGCAA GGTGCTGCAG GCCCGAGACT 660

CTTTACCATA CACCAGATTG ATGCCTGCAC TAACAACCTG CCGAAAGCCC ACACTTGCTT 720

CAATCGAATA GACATTCCAC CCTATGAAAG CTATGAAAAG CTATATGAAA AGCTGCTAAC 780

AGCCATTGAA GAAACATGTG GATTTGCTGT GGAATGACAA GCTTCAAGGA TTTACCCAGG 840
AC



Seq ID No: 107 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

TVLCAKNLVK KDFFRLPDPF AKVVVDGSGQ CHSTDTVKNT LDPKWNQHYD LYIGKSDSVT 60
ISVWNHKKIH KKQGAGFLGC VRLLSNAINR LKDTGYQRLD LCKLGPNDND TVRGQIVVSL 120
QSRDRIGTGG QVVDCSRLFD NDLPDGAHYL WTWKDRC
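
Each DNA entry in this listing gives a "Coding sequence" range, and the protein entry that follows it is the translation of that range. The sketch below illustrates the relationship using the first 30 nucleotides of Seq ID NO: 106 and the first ten residues of Seq ID No: 107. It assumes that the range is 1-based and inclusive and that the standard genetic code applies; the function name `translate_cds` and the use of "X" for unreadable codons are illustrative choices, not part of the listing.

```python
# Illustrative sketch: translate a coding range from a sequence-listing row.
# Assumptions: positions are 1-based and inclusive; standard genetic code.

BASES = "TCAG"
# Amino acids for the 64 codons in TCAG order (standard genetic code).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_cds(dna: str, start: int, end: int) -> str:
    """Translate positions start..end of dna, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # drop the spaces between 10-mer blocks
    cds = seq[start - 1:end]
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        # Codons with unreadable characters (e.g. OCR damage) become "X".
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    first_blocks = "ACAGTACTCT GTGCAAAAAA CCTGGTGAAA"
    print(translate_cds(first_blocks, 1, 30))  # TVLCAKNLVK
```

The same check can be run on any DNA/protein pair in the listing to spot a character that was misread in one of the two sequences.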

Seq ID NO: 108 DNA sequence 

Nucleic Acid Accession #: NM_002318.1

Coding sequence: 248-2572 (underlined sequences correspond to start and stop codons) 
1 11 21 31 41 51

I I I I I I

ACTCCAGCGC GCGGCTACCT ACGCTTGGTG CTTGCTTTCT CCAGCCATCG GAGACCAGAG 60

CCGCCCCCTC TGCTCGAGAA AGGGGCTCAG CGGCGGCGGA AGCGGAGGGG GACCACCGTG 120 

GAGAGCGCGG TCCCAGCCCG GCCACTGCGG ATCCCTGAAA CCAAAAAGCT CCTGCTGCTT 180 

CTGTACCCCG CCTGTCCCTC CCAGCTGCGC AGGGCCCCTT CGTGGGATCA TCAGCCCGAA 240 

GACAGGGATG GAGAGGCCTC TGTGCTCCCA CCTCTGCAGC TGCCTGGCTA TGCTGGCCCT 300 

CCTGTCCCCC CTGAGCCTGG CACAGTATGA CAGCTGGCCC CATTACCCCG AGTACTTCCA 360

GCAACCGGCT CCTGAGTATC ACCAGCCCCA GGCCCCCGCC AACGTGGCCA AGATTCAGCT 420 

GCGCCTGGCT GGGCAGAAGA GGAAGCACAG CGAGGGCCGG GTGGAGGTGT ACTATGATGG 480 

CCAGTGGGGC ACCGTGTGCG ATGACGACTT CTCCATCCAC GCTGCCCACG TCGTCTGCCG 540 

GGAGCTGGGC TATGTGGAGG CCAAGTCCTG GACTGCCAGC TCCTCCTACG GCAAGGGAGA 600 

AGGGCCCATC TGGTTAGACA ATCTCCACTG TACTGGCAAC GAGGCGACCC TTGCAGCATG 660

CACCTCCAAT GGCTGGGGCG TCACTGACTG CAAGCACACG GAGGATGTCG GTGTGGTGTG 720 

CAGCGACAAA AGGATTCCTG GGTTCAAATT TGACAATTCG TTGATCAACC AGATAGAGAA 780 

CCTGAATATC CAGGTGGAGG ACATTCGGAT TCGAGCCATC CTCTCAACCT ACCGCAAGCG 840 

CACCCCAGTG ATGGAGGGCT ACGTGGAGGT GAAGGAGGGC AAGACCTGGA AGCAGATCTG 900 

TGACAAGCAC TGGACGGCCA AGAATTCCCG CGTGGTCTGC GGCATGTTTG GCTTCCCTGG 960

GGAGAGGACA TACAATACCA AAGTGTACAA AATGTTTGCC TCACGGAGGA AGCAGCGCTA 1020 

CTGGCCATTC TCCATGGACT GCACCGGCAC AGAGGCCCAC ATCTCCAGCT GCAAGCTGGG 1080 

CCCCCAGGTG TCACTGGACC CCATGAAGAA TGTCACCTGC GAGAATGGGC TGCCGGCCGT 1140 

GGTGAGTTGT GTGCCTGGGC AGGTCTTCAG CCCTGACGGA CCCTCGAGAT TCCGGAAAGC 1200 

ATACAAGCCA GAGCAACCCC TGGTGCGACT GAGAGGCGGT GCCTACATCG GGGAGGGCCG 1260

CGTGGAGGTG CTCAAAAATG GAGAATGGGG GACCGTCTGC GACGACAAGT GGGACCTGGT 1320 

GTCGGCCAGT GTGGTCTGCA GAGAGCTGGG CTTTGGGAGT GCCAAAGAGG CAGTCACTGG 1380 

CTCCCGACTG GGGCAAGGGA TCGGACCCAT CCACCTCAAC GAGATCCAGT GCACAGGCAA 1440 

TGAGAAGTCC ATTATAGACT GCAAGTTCAA TGCCGAGTCT CAGGGCTGCA ACCACGAGGA 1500 

GGATGCTGGT GTGAGATGCA ACACCCCTGC CATGGGCTTG CAGAAGAAGC TGCGCCTGAA 1560

CGGCGGCCGC AATCCCTACG AGGGCCGAGT GGAGGTGCTG GTGGAGAGAA ACGGGTCCCT 1620 

TGTGTGGGGG ATGGTGTGTG GCCAAAACTG GGGCATCGTG GAGGCCATGG TGGTCTGCCG 1680 

CCAGCTGGGC CTGGGATTCG CCAGCAACGC CTTCCAGGAG ACCTGGTATT GGCACGGAGA 1740 

TGTCAACAGC AACAAAGTGG TCATGAGTGG AGTGAAGTGC TCGGGAACGG AGCTGTCCCT 1800 

GGCGCACTGC CGCCACGACG GGGAGGACGT GGCCTGCCCC CAGGGGGGAG TGCAGTACGG 1860

GGCCGGAGTT GCCTGCTCAG AAACCGCCCC TGACCTGGTC CTCAATGCGG AGATGGTGCA 1920 

GCAGACCACC TACCTGGAGG ACCGGCCCAT GTTCATGCTG CAGTGTGCCA TGGAGGAGAA 1980 

CTGCCTCTCG GCCTCAGCCG CGCAGACCGA CCCCACCACG GGCTACCGCC GGCTCCTGCG 2040 

CTTCTCCTCC CAGATCCACA ACAATGGCCA GTCCGACTTC CGGCCCAAGA ACGGCCGCCA 2100 

CGCGTGGATC TGGCACGACT GTCACAGGCA CTACCACAGC ATGGAGGTGT TCACCCACTA 2160

TGACCTGCTG AACCTCAATG GCACCAAGGT GGCAGAGGGC CACAAGGCCA GCTTCTGCTT 2220 

GGAGGACACA GAATGTGAAG GAGACATCCA GAAGAATTAC GAGTGTGCCA ACTTCGGCGA 2280 

TCAGGGCATC ACCATGGGCT GCTGGGACAT GTACCGCCAT GACATCGACT GCCAGTGGGT 2340 

TGACATCACT GACGTGCCCC CTGGAGACTA CCTGTTCCAG GTTGTTATTA ACCCCAACTT 2400 

CGAGGTTGCA GAATCCGATT ACTCCAACAA CATCATGAAA TGCAGGAGCC GCTATGACGG 2460

CCACCGCATC TGGATGTACA ACTGCCACAT AGGTGGTTCC TTCAGCGAAG AGACGGAAAA 2520

AAAGTTTGAG CACTTCAGCG GGCTCTTAAA CAACCAGCTG TCCCCGCAGT AAAGAAGCCT 2580

GCGTGGTCAA CTCCTGTCTT CAGGCCACAC CACATCTTCC ATGGGACTTC CCCCCAACAA 2640 

CTGAGTCTGA ACGAATGCCA CGTGCCCTCA CCCAGCCCGG CCCCCACCCT GTCCAGACCC 2700 

CTACAGCTGT GTCTAAGCTC AGGAGGAAAG GGACCCTCCC ATCATTCATG GGGGGCTGCT 2760

ACCTGACCCT TGGGGCCTGA GAAGGCCTTG GGGGGGTGGG GTTTGTCCAC AGAGCTGCTG 2820 

GAGCAGCACC AAGAGCCAGT CTTGACCGGG ATGAGGCCCA CAGACAGGTT GTCATCAGCT 2880 

TGTCCCATTC AAGCCACCGA GCTCACCACA GACACAGTGG AGCCGCGCTC TTCTCCAGTG 2940 

ACACGTGGAC AAATGCGGGC TCATCAGCCC CCCCAGAGAG GGTCAGGCCG AACCCCATTT 3000 

CTCCTCCTCT TAGGTCATTT TCAGCAAACT TGAATATCTA GACCTCTCTT CCAATGAAAC 3060

CCTCCAGTCT ATTATAGTCA CATAGATAAT GGTGCCACGT GTTTTCTGAT TTGGTGAGCT 3120 

CAGACTTGGT GCTTCCCTCT CCACAACCCC CACCCCTTGT TTTTCAAGAT ACTATTATTA 3180 

TATTTTCACA GACTTTTGAA GCACAAATTT ATTGGCATTT AATATTGGAC ATCTGGGCCC 3240 

TTGGAAGTAC AAATCTAAGG AAAAACCAAC CCACTGTGTA AGTGACTCAT CTTCCTGTTG 3300 

TTCCAATTCT GTGGGTTTTT GATTCAACGG TGCTATAACC AGGGTCCTGG GTGACAGGGC 3360

GCTCACTGAG CACCATGTGT CATCACAGAC ACTTACACAT ACTTGAAACT TGGAATAAAA 3420 
GAAAGATTTA TG 

Seq ID No: 109 Protein sequence:
Protein Accession #: NP_002309.1

1 11 21 31 41 51

I I I I I I

MERPLCSHLC SCLAMLALLS PLSLAQYDSW PHYPEYFQQP APEYHQPQAP ANVAKIQLRL 60

AGQKRKHSEG RVEVYYDGQW GTVCDDDFSI HAAHVVCREL GYVEAKSWTA SSSYGKGEGP 120

IWLDNLHCTG NEATLAACTS NGWGVTDCKH TEDVGVVCSD KRIPGFKFDN SLINQIENLN 180

IQVEDIRIRA ILSTYRKRTP VMEGYVEVKE GKTWKQICDK HWTAKNSRVV CGMFGFPGER 240

TYNTKVYKMF ASRRKQRYWP FSMDCTGTEA HISSCKLGPQ VSLDPMKNVT CENGLPAVVS 300

CVPGQVFSPD GPSRFRKAYK PEQPLVRLRG GAYIGEGRVE VLKNGEWGTV CDDKWDLVSA 360

SVVCRELGFG SAKEAVTGSR LGQGIGPIHL NEIQCTGNEK SIIDCKFNAE SQGCNHEEDA 420

GVRCNTPAMG LQKKLRLNGG RNPYEGRVEV LVERNGSLVW GMVCGQNWGI VEAMVVCRQL 480

GLGFASNAFQ ETWYWHGDVN SNKVVMSGVK CSGTELSLAH CRHDGEDVAC PQGGVQYGAG 540

VACSETAPDL VLNAEMVQQT TYLEDRPMFM LQCAMEENCL SASAAQTDPT TGYRRLLRFS 600

SQIHNNGQSD FRPKNGRHAW IWHDCHRHYH SMEVFTHYDL LNLNGTKVAE GHKASFCLED 660

TECEGDIQKN YECANFGDQG ITMGCWDMYR HDIDCQWVDI TDVPPGDYLF QVVINPNFEV 720
AESDYSNNIM KCRSRYDGHR IWMYNCHIGG SFSEETEKKF EHFSGLLNNQ LSPQ

Seq ID NO: 110 DNA sequence 

Nucleic Acid Accession #: none found, CAT_73007_3 

Coding sequence: 1-495 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I

CGGACGCGTG GGTCGACCCA CGCGTCCGCC CACGCGTCCG TATGGACAGA GCCTCCACTG 60 

GCTGCTGCCT GCCCGCCACA TACCCAGCTG ACATGGGCAC CGCAGGAGCC ATGCAGCTGT 120 

CTGGGTGATC CTGGGCTTCC TCCTGTTCCG AGGCCACAAC TCCCAGCCCA CAATGACCCA 180 

ACCTCTAGCT CTCAGGGAGG CCTTGGCGGT CTAAGTCTGA CCACAGAGCC AGTTTCTTCC 240 

ACCCAGGATA CATCCCTTCC TCAGAGGCTA ACAGGCCAAG CCATCTGTCC AGCACTGGTA 300 

CCCAGGCGCA GGTGTCCCCA GCAGTGGAAG AGACGGAGGC ACAAGCAGAG ACACATTTCA 360 

ACTGTTCCCC CCAATTCAAC CACCATGAGC CTGAGCATGA GGGAAGATGC GACCATCCTG 420 

CCAGCCCCAC GTCAGAGACT GTGCTCACTG TGGCTGCATT TGGGATGGAG TCGGGTGGAG 480 

GCCCACTCTG GCTAGGGGGC GGCAGGCTGA GAGCTCACCT GTTCAGCAGA GAAGTGGAAC 540 

CACTTTGCTC CTGGAGCCTG TCTACCACAG TGTTATCAGC TTCATTGTCA TCCTGGTGGT 600 

GTGGTGATCA TCCTAGTTGG TGTGGTCAGC CTGAGGGTTC AGTGTCGGAA GAGCAAGGAG 660 

TCTGAAGATC CCAGAACCTG GGAGTACAGG GCGTGTCTGA CAAGCTGGTC ACAGACCATG 720 

GCGAGAACGA CAGCATCGCC CATTATCACA TGGAAGACAT CACACGACTT AGGGCAACAC 780 

GCACTCAGCA GCGAGCATCA AAGGAGCCTA CGCATGGCCC AGACTGAGAG CAAGCACAAA 840 
GGGC 

Seq ID No: 111 Protein sequence: 

Protein Accession #: none found, CAT_73007_3 



1 11 21 31 41 51 

I I I I I I 

RTRGSTHASA HASVWTEPPL AAACPPHTQL TWAPQEPCSC LGDPGLPPVP RPQLPAHNDP 60 
TSSSQGGLGG LSLTTEPVSS TQDTSLPQRL TGQAICPALV PRRRCPQQWK RRRHKQRHIS 120
TVPPNSTTMS LSMREDATIL PAPRQRLCSL WLHLGWSRVE AHSG



Seq ID NO: 112 DNA sequence 

Nucleic Acid Accession #: NM_005424.1

Coding sequence: 37-3453 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

I I I I I I

CGCTCGTCCT GGCTGGCCTG GGTCGGCCTC TGGAGTATGG TCTGGCGGGT GCCCCCTTTC 60
TTGCTCCCCA TCCTCTTCTT GGCTTCTCAT GTGGGCGCGG CGGTGGACCT GACGCTGCTG 120
GCCAACCTGC GGCTCACGGA CCCCCAGCGC TTCTTCCTGA CTTGCGTGTC TGGGGAGGCC 180
GGGGCGGGGA GGGGCTCGGA CGCCTGGGGC CCGCCCCTGC TGCTGGAGAA GGACGACCGT 240
ATCGTGCGCA CCCCGCCCGG GCCACCCCTG CGCCTGGCGC GCAACGGTTC GCACCAGGTC 300
ACGCTTCGCG GCTTCTCCAA GCCCTCGGAC CTCGTGGGCG TCTTCTCCTG CGTGGGCGGT 360
GCTGGGGCGC GGCGCACGCG CGTCATCTAC GTGCACAACA GCCCTGGAGC CCACCTGCTT 420
CCAGACAAGG TCACACACAC TGTGAACAAA GGTGACACCG CTGTACTTTC TGCACGTGTG 480
CACAAGGAGA AGCAGACAGA CGTGATCTGG AAGAGCAACG GATCCTACTT CTACACCCTG 540
GACTGGCATG AAGCCCAGGA TGGGCGGTTC CTGCTGCAGC TCCCAAATGT GCAGCCACCA 600
TCGAGCGGCA TCTACAGTGC CACTTACCTG GAAGCCAGCC CCCTGGGCAG CGCCTTCTTT 660
CGGCTCATCG TGCGGGGTTG TGGGGCTGGG CGCTGGGGGC CAGGCTGTAC CAAGGAGTGC 720
CCAGGTTGCC TACATGGAGG TGTCTGCCAC GACCATGACG GCGAATGTGT ATGCCCCCCT 780
GGCTTCACTG GCACCCGCTG TGAACAGGCC TGCAGAGAGG GCCGTTTTGG GCAGAGCTGC 840
CAGGAGCAGT GCCCAGGCAT ATCAGGCTGC CGGGGCCTCA CCTTCTGCCT CCCAGACCCC 900
TATGGCTGCT CTTGTGGATC TGGCTGGAGA GGAAGCCAGT GCCAAGAAGC TTGTGCCCCT 960
GGTCATTTTG GGGCTGATTG CCGACTCCAG TGCCAGTGTC AGAATGGTGG CACTTGTGAC 1020
CGGTTCAGTG GTTGTGTCTG CCCCTCTGGG TGGCATGGAG TGCACTGTGA GAAGTCAGAC 1080
CGGATCCCCC AGATCCTCAA CATGGCCTCA GAACTGGAGT TCAACTTAGA GACGATGCCC 1140
CGGATCAACT GTGCAGCTGC AGGGAACCCC TTCCCCGTGC GGGGCAGCAT AGAGCTACGC 1200
AAGCCAGACG GCACTGTGCT CCTGTCCACC AAGGCCATTG TGGAGCCAGA GAAGACCACA 1260
GCTGAGTTCG AGGTGCCCCG CTTGGTTCTT GCGGACAGTG GGTTCTGGGA GTGCCGTGTG 1320
TCCACATCTG GCGGCCAAGA CAGCCGGCGC TTCAAGGTCA ATGTGAAAGT GCCCCCCGTG 1380
CCCCTGGCTG CACCTCGGCT CCTGACCAAG CAGAGCCGCC AGCTTG3X3GT CTCCCCGCTG 1440
GTCTCGTTCT CTGGGGATGG ACCCATCTCC ACTGTCCGCC TGCACTACCG GCCCCAGGAC 1500
AGTACCATGG ACTGGTCGAC CATTGTGGTG GACCCCAGTG AGAACGTGAC GTTAATGAAC 1560
CTGAGGCCAA AGACAGGATA CAGTGTTCGT GTGCAGCTGA GCCGGCCAGG GGAAGGAGGA 1620
GAGGGGGCCT GGGGGCCTCC CACCCTCATG ACCACAGACT GTCCTGAGCC TTTGTTGCAG 1680
CCGTGGTTGG AGGGCTGGCA TGTGGAAGGC ACTGACCGGC TGCGAGTGAG CTGGTCCTTG 1740
CCCTTGGTGC CCGGGCCACT GGTGGGCGAC GGTTTCCTGC TGCGCCTGTG GGACGGGACA 1800
CGGGGGCAGG AGCGGCGGGA GAACGTCTCA TCCCCCCAGG CCCGCACTGC CCTCCTGACG 1860
GGACTCACGC CTGGCACCCA CTACCAGCTG GATGTGCAGC TCTACCACTG CACCCTCCTG 1920
GGCCCGGCCT CGCCCCCTGC ACACGTGCTT CTGCCCCCCA GTGGGCCTCC AGCCCCCCGA 1980
CACCTCCACG CCCAGGCCCT CTCAGACTCC GAGATCCAGC TGACATGGAA GCACCCGGAG 2040
GCTCTGCCTG GGCCAATATC CAAGTACGTT GTGGAGGTGC AGGTGGCTGG GGGTGCAGGA 2100
GACCCACTGT GGATAGACGT GGACAGGCCT GAGGAGACAA GCACCATCAT CCGTGGCCTC 2160
AACGCCAGCA CGCGCTACCT CTTCCGCATG CGGGCCAGCA TTCAGGGGCT CGGGGACTGG 2220
AGCAACACAG TAGAAGAGTC CACCCTGGGC AACGGGCTGC AGGCTGAGGG CCCAGTCCAA 2280
GAGAGCCGGG CAGCTGAAGA GGGCCTGGAT CAGCAGCTGA TCCTGGCGGT GGTGGGCTCC 2340
GTGTCTGCCA CCTGCCTCAC CATCCTGGCC GCCCTTTTAA CCCTGGTGTG CATCCGCAGA 2400
AGCTGCCTGC ATCGGAGACG CACCTTCACC TACCAGTCAG GCTCGGGCGA GGAGACCATC 2460
CTGCAGTTCA GCTCAGGGAC CTTGACACTT ACCCGGCGGC CAAAACTGCA GCCCGAGCCC 2520
CTGAGCTACC CAGTGCTAGA GTGGGAGGAC ATCACCTTTG AGGACCTCAT CGGGGAGGGG 2580
AACTTCGGCC AGGTCATCCG GGCCATGATC AAGAAGGACG GGCTGAAGAT GAACGCAGCC 2640
ATCAAAATGC TGAAAGAGTA TGCCTCTGAA AATGACCATC GTGACTTTGC GGGAGAACTG 2700
GAAGTTCTGT GCAAATTGGG GCATCACCCC AACATCATCA ACCTCCTGGG GGCCTGTAAG 2760
AACCGAGGTT ACTTGTATAT CGCTATTGAA TATGCCCCCT ACGGGAACCT GCTAGATTTT 2820
CTGCGGAAAA GCCGGGTCCT AGAGACTGAC CCAGCTTTTG CTCGAGAGCA TGGGACAGCC 2880
TCTACCCTTA GCTCCCGGCA GCTGCTGCGT TTCGCCAGTG ATGCGGCCAA TGGCATGCAG 2940
TACCTGAGTG AGAAGCAGTT CATCCACAGG GACCTGGCTG CCCGGAATGT GCTGGTCGGA 3000
GAGAACCTAG CCTCCAAGAT TGCAGACTTC GGCCTTTCTC GGGGAGAGGA GGTTTATGTG 3060
AAGAAGACGA TGGGGCGTCT CCCTGTGCGC TGGATGGCCA TTGAGTCCCT GAACTACAGT 3120
GTCTATACCA CCAAGAGTGA TGTCTGGTCC TTTGGAGTCC TTCTTTGGGA GATAGTGAGC 3180
CTTGGAGGTA CACCCTACTG TGGCATGACC TGTGCCGAGC TCTATGAAAA GCTGCCCCAG 3240
GGCTACCGCA TGGAGCAGCC TCGAAACTGT GACGATGAAG TGTACGAGCT GATGCGTCAG 3300
TGCTGGCGGG ACCGTCCCTA TGAGCGACCC CCCTTTGCCC AGATTGCGCT ACAGCTAGGC 3360
CGCATGCTGG AAGCCAGGAA GGCCTATGTG AACATGTCGC TGTTTGAGAA CTTCACTTAC 3420
GCGGGCATTG ATGCCACAGC TGAGGAGGCC TGAGCTGCCA TCCAGCCAGA ACGTGGCTCT 3480
GCTGGCCGGA GCAAACTCTG CTGTCTAACC TGTGACCAGT CTGACCCTTA CAGCCTCTGA 3540
CTTAAGCTGC CTCAAGGAAT TTTTTTAACT TAAGGGAGAA AAAAAGGGAT CTGGGGATGG 3600
GGTGGGCTTA GGGGAACTGG GTTCCCATGC TTTGTAGGTG TCTCATAGCT ATCCTGGGCA 3660
TCCTTCTTTC TAGTTCAGCT GCCCCACAGG TGTGTTTCCC ATCCCACTGC TCCCCCAACA 3720
CAAACCCCCA CTCCAGCTCC TTCGCTXAAG CCAGCACTCA CACCACTAAC ATGCCCTGTT 3780
CAGCTACTCC CACTCCCGGC CTGTCATTCA GAAAAAAATA AATGTTCTAA TAAGCTCCAA 3840

Seq ID No: 113 Protein sequence:
Protein Accession #: NP_005415.1

1 11 21 31 41 51

I I I I I I

MVWRVPPFLL PILFLASHVG AAVDLTLLAN LRLTDPQRFF LTCVSGEAGA GRGSDAWGPP 60
LLLEKDDRIV RTPPGPPLRL ARNGSHQVTL RGFSKPSDLV GVFSCVGGAG ARRTRVIYVH 120
NSPGAHLLPD KVTHTVNKGD TAVLSARVHK EKQTDVIWKS NGSYFYTLDW HEAQDGRFLL 180
QLPNVQPPSS GIYSATYLEA SPLGSAFFRL IVRGCGAGRW GPGCTKECPG CLHGGVCHDH 240
DGECVCPPGF TGTRCEQACR EGRFGQSCQE QCPGISGCRG LTFCLPDPYG CSCGSGWRGS 300
QCQEACAPGH FGADCRLQCQ CQNGGTCDRF SGCVCPSGWH GVHCEKSDRI PQILNMASEL 360
EFNLETMPRI NCAAAGNPFP VRGSIELRKP DGTVLLSTKA IVEPEKTTAE FEVPRLVLAD 420
SGFWECRVST SGGQDSRRFK VNVKVPPVPL AAPRLLTKQS RQLVVSPLVS FSGDGPISTV 480
RLHYRPQDST MDWSTIVVDP SENVTLMNLR PKTGYSVRVQ LSRPGEGGEG AWGPPTLMTT 540
DCPEPLLQPW LEGWHVEGTD RLRVSWSLPL VPGPLVGDGF LLRLWDGTRG QERRENVSSP 600
QARTALLTGL TPGTHYQLDV QLYHCTLLGP ASPPAHVLLP PSGPPAPRHL HAQALSDSEI 660
QLTWKHPEAL PGPISKYVVE VQVAGGAGDP LWIDVDRPEE TSTIIRGLNA STRYLFRMRA 720
SIQGLGDWSN TVEESTLGNG LQAEGPVQES RAAEEGLDQQ LILAVVGSVS ATCLTILAAL 780
LTLVCIRRSC LHRRRTFTYQ SGSGEETILQ FSSGTLTLTR RPKLQPEPLS YPVLEWEDIT 840
FEDLIGEGNF GQVIRAMIKK DGLKMNAAIK MLKEYASEND HRDFAGELEV LCKLGHHPNI 900
INLLGACKNR GYLYIAIEYA PYGNLLDFLR KSRVLETDPA FAREHGTAST LSSRQLLRFA 960
SDAANGMQYL SEKQFIHRDL AARNVLVGEN LASKIADFGL SRGEEVYVKK TMGRLPVRWM 1020
AIESLNYSVY TTKSDVWSFG VLLWEIVSLG GTPYCGMTCA ELYEKLPQGY RMEQPRNCDD 1080
EVYELMRQCW RDRPYERPPF AQIALQLGRM LEARKAYVNM SLFENFTYAG IDATAEEA



Seq ID NO: 114 DNA sequence 

Nucleic Acid Accession #: NM_002632.1 

Coding sequence: 322-771 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GGGATTCGGG CCGCCCAGCT ACGGGAGGAC CTGGAGTGGC ACTGGGCGCC CGACGGACCA 60

TCCCCGGGAC CCGCCTGCCC CTCGGCGCCC CGCCCCGCCG GGCCGCTCCC CGTCGGGTTC 120

CCCAGCCACA GCCTTACCTA CGGGCTCCTG ACTCCGCAAG GCTTCCAGAA GATGCTCGAA 180

CCACCGGCCG GGGCCTCGGG GCAGCAGTGA GGGAGGCGTC CAGCCCCCCA CTCAGCTCTT 240

CTCCTCCTGT GCCAGGGGCT CCCCGGGGGA TGAGCATGGT GGTTTTCCCT CGGAGCCCCC 300

TGGCTCGGGA CGTCTGAGAA GATGCCGGTC ATGAGGCTGT TCCCTTGCTT CCTGCAGCTC 360

CTGGCCGGGC TGGCGCTGCC TGCTGTGCCC CCCCAGCAGT GGGCCTTGTC TGCTGGGAAC 420

GGCTCGTCAG AGGTGGAAGT GGTACCCTTC CAGGAAGTGT GGGGCCGCAG CTACTGCCGG 480

GCGCTGGAGA GGCTGGTGGA CGTCGTGTCC GAGTACCCCA GCGAGGTGGA GCACATGTTC 540

AGCCCATCCT GTGTCTCCCT GCTGCGCTGC ACCGGCTGCT GCGGCGATGA GAATCTGCAC 600 

TGTGTGCCGG TGGAGACGGC CAATGTCACC ATGCAGCTCC TAAAGATCCG TTCTGGGGAC 660 

CGGCCCTCCT ACGTGGAGCT GACGTTCTCT CAGCACGTTC GCTGCGAATG CCGGCCTCTG 720 

CGGGAGAAGA TGAAGCCGGA AAGGTGCGGC GATGCTGTTC CCCGGAGGTA ACCCACCCCT 780

TGGAGGAGAG AGACCCCGCA CCCGGCTCCT GTATTTATTA CCGTCACACT CTTCAGTGAC 840 

TCCTGCTGGT ACCTGCCCTC TATTTATTAG CCAACTGTTT CCCTGCTGAA TGCCTCGCTC 900 

CCTTCAAGAC GAGGGGCAGG GAAGGACAGG ACCCTCAGGA ATTCAGTGCC TTCAACAACG 960 

TGAGAGAAAG AGAGAAGCCA GCCACAGACC CCTGGGAGCT TCCGCTTTGA AAGAAGCAAG 1020 

ACACGTGGCC TCGTGAGGGG CAAGCTAGGC CCCAGAGGCC CTGGAGGTCT CCAGGGGCCT 1080

GCAGAAGGAA AGAAGGGGGC CCTGCTACCT GTTCTTGGGC CTCAGGCTCT GCACAGACAA 1140 

GCAGCCCTTG CTTTCGGAGC TCCTGTCCAA AGTAGGGATG CGGATTCTGC TGGGGCCGCC 1200 

ACGGCCTGGT GGTGGGAAGG CCGGCAGCGG GCGGAGGGGA TTCAGCCACT TCCCCCTCTT 1260 

CTTCTGAAGA TCAGAACATT CAGCTCTGGA GAACAGTGGT TGCCTGGGGG CTTTTGCCAC 1320 

TCCTTGTCCC CCGTGATCTC CCCTCACACT TTGCCATTTG CTTGTACTGG GACATTGTTC 1380

TTTCCGGCCG AGGTGCCACC ACCCTGCCCC CACTAAGAGA CACATACAGA GTGGGCCCCG 1440 

GGCTGGAGAA AGAGCTGCCT GGATGAGAAA CAGCTCAGCC AGTGGGGATG AGGTCACCAG 1500 

GGGAGGAGCC TGTGCGTCCC AGCTGAAGGC AGTGGCAGGG GAGCAGGTTC CCCAAGGGCC 1560 

CTGGCACCCC CACAAGCTGT CCCTGCAGGG CCATCTGACT GCCAAGCCAG ATTCTCTTGA 1620 

ATAAAGTATT CTAGTGTGGA AACGC



Seq ID No: 115 Protein sequence: 
Protein Accession #: NP_002623.1

MPVMRLFPCF LQLLAGLALP AVPPQQWALS AGNGSSEVEV VPFQEVWGRS YCRALERLVD 60
VVSEYPSEVE HMFSPSCVSL LRCTGCCGDE NLHCVPVETA NVTMQLLKIR SGDRPSYVEL 120
TFSQHVRCEC RPLREKMKPE RCGDAVPRR

Seq ID NO: 116 DNA sequence 

Nucleic Acid Accession #: NM_007361.1

Coding sequence: 1-4131 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

ATGGAGGGGG ACCGGGTGGC CGGGCGGCCG GTGCTGTCGT CGTTACCAGT GCTACTGCTG 60

CTGCAGTTGC TAATGTTGCG GGCCGCGGCG CTGCACCCAG ACGAGCTCTT CCCACACGGG 120

GAGTCGTGGT GGGACCAGCT CCTGCAGGAA GGCGACGACG TAAAGCTCAG CCGTGGTGAA 180

GCTGGCGAAT CCCCTGCACT TCTTACGAAG CCCGATTCAG CAACCTCTAC GTGGGCACCA 240

ACGGCATCAT CTCCACTCAG GACTTCCCCA GGGAAACGCA GTATGTGGAC TATGATTTCC 300

CCACCGACTT CCCGGCCATC GCCCCTTTTC TGGCGGACAT CGACACGAGC CACGGCAGAG 360

GCCGAGTCCT GTACCGAGAG GACACCTCCC CCGCAGTGCT GGGCCTGGCC GCCCGCTATG 420

TGCGCGCTGG CTTCCCGCGC TCTGCGCGCT TTTTACCCCC ACCCACGCCT TCCTGGCCAC 480

CTGGGAGCAG GTAGGCGCTT ACGAGGAGGT CAAACGCGGG CGCTGCCCTC GGGAGAGCTG 540

AACACTTTCC AGGCAGTTTT GGCATCTGAT GGGTCTGATA GCTACGCCCT CTTTCTTTAT 600

CCTGCCAACG GCCTGCAGTT CCTTGGAACC GGCCCCAAAG AGTCTTACAA TGTCCAGCTT 660

CAGCTTCCAG CTCGGGTGGG CTTCTGCCGA GGGGAGGCTG ATGATCTGAA GTCAGAAGGA 720

CCATATTTCA GCTTGACTAG CACTGAACAG TCTGTGAAAA ATCTCTATCA ACTAAGCAAC 780

CTGGGGATCC CTGGAGTGTG GGCTTTCCAT ATCGGCAGCA CTTCCCCGTT GGACAATGTC 840

AGGCCAGCTG CAGTTGGAGA ---------- GCCCACTCTT CTGTTCCCCT GGGACGTTCC 900

TTCAGCCATG CTACAGCCCT GGAAAGTGAC TATAATGAGG ACAATTTGGA TTACTACGAT 960

GTGAATGAGG AGGAAGCTGA ATACCTTCCG GGTGAACCAG AGGAGGCATT GAATGGCCAC 1020

AGCAGCATTG ATGTTTCCTT CCAATCCAAA GTGGATACAA AGCCTTTAGA GGAATCTTCC 1080

ACCTTGGATC CTCACACCAA AGAAGGAACA TCTCTGGGAG AGGTAGGGGG CCCAGATTTA 1140

AAAGGCCAAG TTGAGCCCTG GGATGAGAGA GAGACCAGAA GCCCAGCTCC ACCAGAGGTA 1200

GACAGAGATT CACTGGCTCC TTCCTGGGAA ACCCCACCAC CGTACCCCGA AAACGGAAGC 1260

ATCCAGCCCT ACCCAGATGG AGGGCCAGTG CCTTCGGAAA TGGATGTTCC CCCAGCTCAT 1320

CCTGAAGAAG AAATTGTTCT TCGAAGTTAC CCTGCTTCAG GTCACACTAC ACCCTTAAGT 1380

CGAGGGACGT ATGAGGTGGG ACTGGAAGAC AACATAGGTT CCAACACCGA GGTCTTCACG 1440

TATAATGCTG CCAACAAGGA AACCTGTGAA CACAACCACA GACAATGCTC CCGGCATGCC 1500

TTCTGCACGG ACTATGCCAC TGGCTTCTGC TGCCACTGCC AATCCAAGTT TTATGGAAAT 1560

GGGAAGCACT GTCTGCCTGA GGGGGCACCT CACCGAGTGA ATGGGAAAGT GAGTGGCCAC 1620

CTCCACGTGG GCCATACACC CGTGCACTTC ACTGATGTGG ACCTGCATGC GTATATCGTG 1680

GGCAATGATG GCAGAGCCTA CACGGCCATC AGCCACATCC CACAGCCAGC AGCCCAGGCC 1740

CTCCTCCCCC TCACACCAAT TGGAGGCCTG TTTGGCTGGC TCTTTGCTTT AGAAAAACCT 1800

GGCTCTGAGA ACGGCTTCAG CCTCGCAGGT GCTGCCTTTA CCCATGACAT GGAAGTTACA 1860

TTCTACCCGG GAGAGGAGAC GGTTCGTATC ACTCAAACTG CTGAGGGACT TGACCCAGAG 1920

AACTACCTGA GCATTAAGAC CAACATTCAA GGCCAGGTGC CTTACGTCCC AGCAAATTTC 1980

ACAGCCCACA TCTCTCCCTA CAAGGAGCTG TACCACTACT CCGACTCCAC TGTGACCTCT 2040

ACAAGTTCCA GAGACTACTC TCTGACTTTT GGTGCAATCA ACCAAACATG GTCCTACCGC 2100

ATCCACCAGA ACATCACTTA CCAGGTGTGC AGGCACGCCC CCAGACACCC GTCCTTCCCC 2160

ACCACCCAGC AGCTGAACGT GGACCGGGTC TTTGCCTTGT ATAATGATGA AGAAAGAGTG 2220

CTTAGATTTG CTGTGACCAA TCAAATTGGC CCGGTCAAAG AAGATTCAGA CCCCACTCCG 2280

GTGAATCCTT GCTATGATGG GAGCCACATG TGTGACACAA CAGCACGGTG CCATCCAGGG 2340

ACAGGTGTAG ATTACACCTG TGAGTGCGCA TCTGGGTACC AGGGAGATGG ACGGAACTGT 2400

GTGGATGAAA ATGAATGTGC AACTGGCTTT CATCGCTGTG GCCCCAACTC TGTATGTATC 2460

AACTTGCCTG GAAGCTACAG GTGTGAGTGC CGGAGTGGTT ATGAGTTTGC AGATGACCGG 2520

CATACTTGCA TCTTGATCAC CCCACCTGCC AACCCCTGTG AGGATGGCAG TCATACCTGT 2580

GCTCCTGCTG GGCAGGCCCG GTGTGTTCAC CATGGAGGCA GCACGTTCAG CTGTGCCTGC 2640 

CTGCCTGGTT ATGCCGGCGA TGGGCACCAG TGCACTGATG TAGATGAATG CTCAGAAAAC 2700 

AGATGTCACC CTGCAGCTAC CTGCTACAAT ACTCCTGGTT CCTTCTCCTG CCGTTGTCAA 2760 

CCCGGATATT ATGGGGATGG ATTTCAGTGC ATACCTGACT CCACCTCAAG CCTGACACCC 2820 

TGTGAACAAC AGCAGCGCCA TGCCCAGGCC CAGTATGCCT ACCCTGGGGC CCGGTTCCAC 2880

ATCCCCCAAT GCGACGAGCA GGGCAACTTC CTGCCCCTAC AGTGTCATGG CAGCACTGGT 2940 

TTCTGCTGGT GCGTGGACCC TGATGGTCAT GAAGTTCCTG GTACCCAGAC TCCACCTGGC 3000 

TCCACCCCGC CTCACTGTGG ACCATCACCA GAGCCCACCC AGAGGCCCCC GACCATCTGT 3060

GAGCGCTGGA GGGAAAACCT GCTGGAGCAC TACGGTGGCA CCCCCCGAGA TGACCAGTAC 3120 

GTGCCCCAGT GCGATGACCT GGGCCACTTC ATCCCCCTGC AGTGCCACGG AAAGAGCGAC 3180 

TTCTGCTGGT GTGTGGACAA AGATGGCAGA GAGGTGCAGG GCACCCGCTC CCAGCCAGGC 3240 

ACCACCCCTG CGTGTATACC CACCGTCGCT CCACCCATGG TCCGGCCCAC GCCCCGGCCA 3300 

GATGTGACCC CTCCATCTGT GGGCACCTTC CTGCTCTATA CTCAGGGCCA GCAGATTGGC 3360 

TACTTACCCC TCAATGGCAC CAGGCTTCAG AAGGATGCAG CTAAGACCCT GCTGTCTCTG 3420 

CATGGCTCCA TAATCGTGGG AATTGATTAC GACTGCCGGG AGAGGATGGT GTACTGGACA 3480 

GATGTTGCTG GACGGACAAT CAGCCGTGCC GGTCTGGAAC TGGGAGCAGA GCCTGAGACG 3540 

ATCGTGAATT CAGGTCTGAT AAGCCCTGAA GGACTTGCCA TAGACCACAT CCGCAGAACA 3600 

ATGTACTGGA CGGACAGTGT CCTGGATAAG ATAGAGAGCG CCCTGCTGGA TGGCTCTGAG 3660 

CGCAAGGTCC TCTTCTACAC AGATCTGGTG AATCCCCGTG CCATCGCTGT GGATCCAATC 3720 

CGAGGCAACT TGTACTGGAC AGACTGGAAT AGAGAAGCTC CTAAAATTGA AACGTCATCT 3780 

TTAGATGGAG AAAACAGAAG AATTCTGATC AATACAGACA TTGGATTGCC CAATGGCTTA 3840 

ACCTTTGACC CTTTCTCTAA ACTGCTCTGC TGGGCAGATG CAGGAACCAA AAAACTGGAG 3900 

TGTACACTAC CTGATGGAAC TGGACGGCGT GTCATTCAAA ACAACCTCAA GTACCCCTTC 3960 

AGCATCGTAA GCTATGCAGA TCACTTCTAC CACACAGACT GGAGGAGGGA TGGTGTTGTA 4020 

TCAGTAAATA AACATAGTGG CCAGTTTACT GATGAGTATC TCCCAGAACA ACGATCTCAC 4080 

CTCTACGGGA TAACTGCAGT CTACCCCTAC TGCCCAACAG GAAGAAAGTA AGTACAGTAA 4140

TGTAAAGGAA GACTTGGAGT TTACAATCAG AACCTGGACC CTAAAGAACA GTGACTGCAA 4200 

AGGCAAAGAA AGTAAAAAAG GAATTGGCCA TTAGACGTTC CTGAGCATCC AAGATGAACA 4260 

TTTTGTAGTG CAAAAAGACT TTTGTGAAAA GCTGATACCT CAATCTTTAC TACTGTATTT 4320 

TTAAAAATGA AGGTTGTTAT TGCAAGTTTA AAAAGGTAAC AGAATTTTAA CTGTTGCTTA 4380 

TTAAAGCAAC TTCTTGTAAA CATTTATCAT TAATATTTAA AAGATCAAAT TCATTCAACT 4440 

AAGAATTAGA GTTTAAGACT CTAAACCTGA TTTTTGCCAT GGATTCCTTC TGGCCAAGAA 4500 

ATTAAAGCAC ATGTGATCAA TATAACAATA TAATCCTAAA CCTTGACAGT TGGAGAAGCC 4560 

AATGCAGAAC TGATGGGAAA GGACCAATTA TTTATAGTTT CCCAACAAAA GTTCTAAGAT 4620 

TTTTTACCTC TGCATCAGTG CATTTCTATT TATATCAAAA GGTGCTAAAA TGATTCAATT 4680 

TGCATTTTCT GATCCTGTAG TGCCTCTATA GAAGTACCCA CAGAAAGTAA AGTATCACAT 4740 

TTATAAATAC CAAAGATGTA ACAATTTTAA AATTTTCTAG ATTACTCCAA TAAAGTGTTT 4800 
TAAGTTTAAA AAAAAAAAAA AAAAAAAAA 



Seq ID No: 117 Protein sequence:
Protein Accession #: NP_031387.1

1 11 21 31 41 51

I I I I I I

MEGDRVAGRP VLSSLPVLLL LQLLMLRAAA LHPDELFPHG ESWWDQLLQE GDDVKLSRGE 60
AGESPALLTK PDSATSTWAP TASSPLRTSP GKRSMWTMIS PPTSRPSPLF WRTSTRATAE 120
AESCTERTPP PQCWAWPPAM CALASRALRA FYPHPRLPGH LGAGRRLRGG QTRALPSGEL 180
NTFQAVLASD GSDSYALFLY PANGLQFLGT GPKESYNVQL QLPARVGFCR GEADDLKSEG 240
PYFSLTSTEQ SVKNLYQLSN LGIPGVWAFH IGSTSPLDNV RPAAVGDLSA AHSSVPLGRS 300
FSHATALESD YNEDNLDYYD VNEEEAEYLP GEPEEALNGH SSIDVSFQSK VDTKPLEESS 360
TLDPHTKEGT SLGEVGGPDL KGQVEPWDER ETRSPAPPEV DRDSLAPSWE TPPPYPENGS 420
IQPYPDGGPV PSEMDVPPAH PEEEIVLRSY PASGHTTPLS RGTYEVGLED NIGSNTEVFT 480
YNAANKETCE HNHRQCSRHA FCTDYATGFC CHCQSKFYGN GKHCLPEGAP HRVNGKVSGH 540
LHVGHTPVHF TDVDLHAYIV GNDGRAYTAI SHIPQPAAQA LLPLTPIGGL FGWLFALEKP 600
GSENGFSLAG AAFTHDMEVT FYPGEETVRI TQTAEGLDPE NYLSIKTNIQ GQVPYVPANF 660
TAHISPYKEL YHYSDSTVTS TSSRDYSLTF GAINQTWSYR IHQNITYQVC RHAPRHPSFP 720
TTQQLNVDRV FALYNDEERV LRFAVTNQIG PVKEDSDPTP VNPCYDGSHM CDTTARCHPG 780
TGVDYTCECA SGYQGDGRNC VDENECATGF HRCGPNSVCI NLPGSYRCEC RSGYEFADDR 840
HTCILITPPA NPCEDGSHTC APAGQARCVH HGGSTFSCAC LPGYAGDGHQ CTDVDECSEN 900
RCHPAATCYN TPGSFSCRCQ PGYYGDGFQC IPDSTSSLTP CEQQQRHAQA QYAYPGARFH 960
IPQCDEQGNF LPLQCHGSTG FCWCVDPDGH EVPGTQTPPG STPPHCGPSP EPTQRPPTIC 1020
ERWRENLLEH YGGTPRDDQY VPQCDDLGHF IPLQCHGKSD FCWCVDKDGR EVQGTRSQPG 1080
TTPACIPTVA PPMVRPTPRP DVTPPSVGTF LLYTQGQQIG YLPLNGTRLQ KDAAKTLLSL 1140
HGSIIVGIDY DCRERMVYWT DVAGRTISRA GLELGAEPET IVNSGLISPE GLAIDHIRRT 1200
MYWTDSVLDK IESALLDGSE RKVLFYTDLV NPRAIAVDPI RGNLYWTDWN REAPKIETSS 1260
LDGENRRILI NTDIGLPNGL TFDPFSKLLC WADAGTKKLE CTLPDGTGRR VIQNNLKYPF 1320
SIVSYADHFY HTDWRRDGVV SVNKHSGQFT DEYLPEQRSH LYGITAVYPY CPTGRK



Seq ID NO: 118 DNA sequence 

Nucleic Acid Accession #: NM_003088.1 

Coding sequence: 112-1593 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GCGGAGGGTG CGTGCGGGCC GCGGCAGCCG AACAAAGGAG CAGGGGOGCC GCCGCAGGGA 60

CCCGCCACCC ACCTCCCGGG GCCGCGCAGC GGCCTCTCGT CTACTGCCAC CATGACCGCC 120

AACGGCACAG CCGAGGCGGT GCAGATCCAG TTCGGCCTCA TCAACTGCGG CAACAAGTAC 180

CTGACGGCCG AGGCGTTCGG GTTCAAGGTG AACGCGTCCG CCAGCAGCCT GAAGAAGAAG 240

CAGATCTGGA CGCTGGAGCA GCCCCCTGAC GAGGCGGGCA GCGCGGCCGT GTGCCTGCGC 300

AGCCACCTGG GCCGCTACCT GGCGGCGGAC AAGGACGGCA ACGTGACCTG CGAGCGCGAG 360

GTGCCCGGTC CCGACTGCCG TTTCCTCATC GTGGCGCACG ACGACGGTCG            420

CAGTCCGAGG CGCACCGGCG CTACTTCGGC GGCACCGAGG ACCGCCTGTC CTGCTTCGCG 480

CAGACGGTGT CCCCCGCCGA GAAGTGGAGC GTGCACATCG CCATGCACCC TCAGGTCAAC 540

ATCTACAGTG TCACCCGTAA GCGCTACGCG CACCTGAGCG CGCGGCCGGC CGACGAGATC 600

GCCGTGGACC GCGACGTGCC CTGGGGCGTC GACTCGCTCA TCACCCTCGC CTTCCAGGAC 660

CAGCGCTACA GCGTGCAGAC CGCCGACCAC CGCTTCCTGC GCCACGACGG GCGCCTGGTG 720

GCGCGCCCCG AGCCGGCCAC TGGCTACACG CTGGAGTTCC GCTCCGGCAA GGTGGCCTTC 780

CGCGACTGCG AGGGCCGTTA CCTGGCGCCG TCGGGGCCCA GCGGCACGCT CAAGGCGGGC 840

AAGGCCACCA AGGTGGGCAA GGACGAGCTC TTTGCTCTGG AGCAGAGCTG CGCCCAGGTC 900

GTGCTGCAGG CGGCCAACGA GAGGAACGTG TCCACGCGCC AGGGTATGGA CCTGTCTGCC 960

AATCAGGACG AGGAGACCGA CCAGGAGACC TTCCAGCTGG AGATCGACCG CGACACCAAA 1020

AAGTGTGCCT TCCGTACCCA CACGGGCAAG TACTGGACGC TGACGGCCAC CGGGGGCGTG 1080

CAGTCCACCG CCTCCAGCAA GAATGCCAGC TGCTACTTTG ACATCGAGTG GCGTGACCGG 1140

CGCATCACAC TGAGGGCGTC CAATGGCAAG TTTGTGACCT CCAAGAAGAA TGGGCAGCTG 1200

GCCGCCTCGG TGGAGACAGC AGGGGACTCA GAGCTCTTCC TCATGAAGCT CATCAACCGC 1260

CCCATCATCG TGTTCCGCGG GGAGCATGGC TTCATCGGCT GCCGCAAGGT CACGGGCACC 1320

CTGGACGCCA ACCGCTCCAG CTATGACGTC TTCCAGCTGG AGTTCAACGA TGGCGCCTAC 1380

AACATCAAAG ACTCCACAGG CAAATACTGG ACGGTGGGCA GTGACTCCGC GGTCACCAGC 1440

AGCGGCGACA CTCCTGTGGA CTTCTTCTTC GAGTTCTGCG ACTATAACAA GGTGGCCATC 1500

AAGGTGGGCG GGCGCTACCT GAAGGGCGAC CACGCAGGCG TCCTGAAGGC CTCGGCGGAA 1560

ACCGTGGACC CCGCCTCGCT CTGGGAGTAC TAGGGCCGGC CCGTCCTTCC CCGCCCCTGC 1620

CCACATGGCG GCTCCTGCCA ACCCTCCCTG CTAACCCCTT CTCCGCCAGG TGGGCTCCAG 1680

GGCGGGAGGC AAGCCCCCTT GCCTTTCAAA CTGGAAACCC CAGAGAAAAC GGTGCCCCCA 1740

CCTGTCGCCC CTATGGACTC CCCACTCTCC CCTCCGCCCG GGTTCCCTAC TCCCCTCGGG 1800

TCAGCGGCTG CGGCCTGGCC CTGGGAGGGA TTTCAGATGC CCCTGCCCTC TTGTCTGCCA 1860

CGGGGCGAGT CTGGCACCTC TTTCTTCTGA CCTCAGACGG CTCTGAGCCT TATTTCTCTG 1920

GAAGCGGCTA AGGGACGGTT GGGGGCTGGG AGCCCTGGGC GTGTAGTGTA ACTGGAATCT 1980

TTTGCCTCTC CCAGCCACCT CCTCCCAGCC CCCCAGGAGA GCTGGGCACA TGTCCCAAGC 2040

CTGTCAGTGG CCCTCCCTGG TGCACTGTCC CCGAAACCCC TGCTTGGGAA GGGAAGCTGT 2100

CGGGAGGGCT AGGACTGACC CTTGTGGTGT TTTTTTGGGT GGTGGCTGGA AACAGCCCCT 2160

CTCCCACGTG GGAGAGGCTC AGCCTGGCTC CCTTCCCTGG AGCGGCAGGG CGTGACGGCC 2220

ACAGGGTCTG CCCGCTGCAC GTTCTGCCAA GGTGGTGGTG GCGGGCGGGT AGGGGTGTGG 2280

GGGCCGTCTT CCTCCTGTCT CTTTCCTTTC ACCCTAGCCT GACTGGAAGC AGAAAATGAC 2340

CAAATCAGTA TTTTTTTTAA TGAAATATTA TTGCTGGAGG CGTCCCAGGC AAGCCTGGCT 2400

GTAGTAGCGA GTGATCTGGC GGGGGGCGTC TCAGCACCCT CCCCAGGGGG TGCATCTCAG 2460

CCCCCTCTTT CCGTCCTTCC CGTCCAGCCC CAGCCCTGGG CCTGGGCTGC CGACACCTGG 2520

GCCAGAGCCC CTGCTGTGAT TGGTGCTCCC TGGGCCTCCC GGGTGGATGA AGCCAGGCGT 2580

CGCCCCCTCC GGGAGCCCTG GGGTGAGCCG CCGGGGCCCC CCTGCTGCCA GCCTCCCCCG 2640

TCCCCAACAT GCATCTCACT CTGGGTGTCT TGGTCTTTTA TTTTTTGTAA GTGTCATTTG 2700

TATAACTCTA AACGCCCATG ATAGTAGCTT CAAACTGGAA ATAGCGAAAT AAAATAACTC 2760
AGTCTGC



Seq ID No: 119 Protein sequence:
Protein Accession #: NP_003079.1

1 11 21 31 41 51

I I I I I I

MTANGTAEAV QIQFGLINCG NKYLTAEAFG FKVNASASSL KKKQIWTLEQ PPDEAGSAAV 60

CLRSHLGRYL AADKDGNVTC EREVPGPDCR FLIVAHDDGR WSLQSEAHRR YFGGTEDRLS 120

CFAQTVSPAE KWSVHIAMHP QVNIYSVTRK RYAHLSARPA DEIAVDRDVP WGVDSLITLA 180

FQDQRYSVQT ADHRFLRHDG RLVARPEPAT GYTLEFRSGK VAFRDCEGRY LAPSGPSGTL 240

KAGKATKVGK DELFALEQSC AQVVLQAANE RNVSTRQGMD LSANQDEETD QETFQLEIDR 300

DTKKCAFRTH TGKYWTLTAT GGVQSTASSK NASCYFDIEW RDRRITLRAS NGKFVTSKKN 360

GQLAASVETA GDSELFLMKL INRPIIVFRG EHGFIGCRKV TGTLDANRSS YDVFQLEFND 420

GAYNIKDSTG KYWTVGSDSA VTSSGDTPVD FFFEFCDYNK VAIKVGGRYL KGDHAGVLKA 480
SAETVDPASL WEY



Seq ID NO: 120 DNA sequence 

Nucleic Acid Accession #: NM_006404.1 

Coding sequence: 25-741 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

CAGGTCCGGA GCCTCAACTT CAGGATGTTG ACAACATTGC TGCCGATACT GCTGCTGTCT 60

GGCTGGGCCT TTTGTAGCCA AGACGCCTCA GATGGCCTCC AAAGACTTCA TATGCTCCAG 120

ATCTCCTACT TCCGCGACCC CTATCACGTG TGGTACCAGG GCAACGCGTC GCTGGGGGGA 180

CACCTAACGC ACGTGCTGGA AGGCCCAGAC ACCAACACCA CGATCATTCA GCTGCAGCCC 240 

TTGCAGGAGC CCGAGAGCTG GGCGCGCACG CAGAGTGGCC TGCAGTCCTA CCTGCTCCAG 300 

TTCCACGGCC TCGTGCGCCT GGTGCACCAG GAGCGGACCT TGGCCTTTCC TCTGACCATC 360 

CGCTGCTTCC TGGGCTGTGA GCTGCCTCCC GAGGGCTCTA GAGCCCATGT CTTCTTCGAA 420 

GTGGCTGTGA ATGGGAGCTC CTTTGTGAGT TTCCGGCCGG AGAGAGCCTT GTGGCAGGCA 480 

GACACCCAGG TCACCTCCGG AGTGGTCACC TTCACCCTGC AGCAGCTCAA TGCCTACAAC 540 

CGCACTCGGT ATGAACTGCG GGAATTCCTG GAGGACACCT GTGTGCAGTA TGTGCAGAAA 600 

CATATTTCCG CGGAAAACAC GAAAGGGAGC CAAACAAGCC GCTCCTACAC TTCGCTGGTC 660 

CTGGGCGTCC TGGTGGGCGG TTTCATCATT GCTGGTGTGG CTGTAGGCAT CTTCCTGTGC 720 

ACAGGTGGAC GGCGATGTTA ATTACTCTCC AGCCCCGTCA GAAGGGGCTG GATTGATGGA 780 

GGCTGGCAAG GGAAAGTTTC AGCTCACTGT GAAGCCAGAC TCCCCAACTG AAACACCAGA 840 

AGGTTTGGAG TGACAGCTCC TTTCTTCTCC CACATCTGCC CACTGAAGAT TTGAGGGAGG 900 

GGAGATGGAG AGGAGAGGTG GACAAAGTAC TTGGTTTGCT AAGAACCTAA GAACGTGTAT 960 

GCTTTGCTGA ATTAGTCTGA TAAGTGAATG TTTATCTATC TTTGTGGAAA ACAGATAATG 1020 

GAGTTGGGGC AGGAAGCCTA TGCGCCATCC TCCAAAGACA GACAGAATCA CCTGAGGCGT 1080 

TCAAAAGATA TAACCAAATA AACAAGTCAT CCACAATCAA AATACAACAT TCAATACTTC 1140 

CAGGTGTGTC AGACTTGGGA TGGGACGCTG ATATAATAGG GTAGAAAGAA GTAACACGAA 1200 

GAAGTGGTGG AAATGTAAAA TCCAAGTCAT ATGGCAGTGA TCAATTATTA ATCAATTAAT 1260 
AATATTAATA AATTTCTTAT ATTT 

Seq ID No: 121 Protein sequence:
Protein Accession #: NP_006395.1

1 11 21 31 41 51

I I I I I I

MLTTLLPILL LSGWAFCSQD ASDGLQRLHM LQISYFRDPY HVWYQGNASL GGHLTHVLEG 60

PDTNTTIIQL QPLQEPESWA RTQSGLQSYL LQFHGLVRLV HQERTLAFPL TIRCFLGCEL 120

PPEGSRAHVF FEVAVNGSSF VSFRPERALW QADTQVTSGV VTFTLQQLNA YNRTRYELRE 180

FLEDTCVQYV QKHISAENTK GSQTSRSYTS LVLGVLVGGF IIAGVAVGIF LCTGGRRC 



Seq ID NO: 122 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 2-502 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CGAGAAGCTG GGAGAGACAC CACTTGTCCC TGAACAAGAC AATTCAGTAA CATCTATTCC 60 

TGAGATTCCT CGATGGGGAT CACAGAGCAC GATGTCTACC CTTCAAATGT CCCTTCAAGC 120 

CGAGTCAAAG GCCACTATCA CCCCATCAGG GAGCGTGATT TCCAAGTTTA ATTCTACGAC 180 

TTCCTCTGCC ACTCCTCAGG CTTTCGACTC CTCCTCTGCC GTGGTCTTCA TATTTGTGAG 240 

CACAGCAGTA GTAGTGTTGG TGATCTTGAC CATGACAGTA CTGGGGCTTG TCAAGCTCTG 300 

CTTTCACGAA AGCCCCTCTT CCCAGCCAAG GAAGGAGTCT ATGGGCCCGC CGGGCCTGGA 360 

GAGTGATCCT GAGCCCGCTG CTTTGGGCTC CAGTTCTGCA CATTGCACAA ACAATGGGGT 420 

GAAAGTCGGG GACTGTGATC TGCGGGACAG AGCAGAGGGT GCCTTGCTGG CGGAGTCCCC 480 

TCTTGGCTCT AGTGATGCAT AGGGAAACAG GGGACATGGG CACTCCTGTG AACAGTTTTT 540 

CACTTTTGAT GAAACGGGGA ACCAAGAGGA ACTTACTTGT GTAACTGACA ATTTCTGCAG 600 

AAATCCCCCT TCCTCTAAAT TCCCTTTACT CCACTGAGGA GCTAAATCAG AACTGCACAC 660 

TCCTTCCCTG ATGATAGAGG AAGTGGAAGT GCCTTTAGGA TGGTGATACT GGGGGACCGG 720 

GTAGTGCTGG GGAGAGATAT TTTCTTATGT TTATTCGGAG AATTTGGAGA AGTGATTGAA 780 

CTTTTCAAGA CATTGGAAAC AAATAGAACA CAATATAATT TACATTAAAA AATAATTTCT 840 

ACCAAAATGG AAAGGAAATG TTCTATGTTG TTCAGGCTAG GAGTATATTG GTTCGAAATC 900 
CCAGGGAAAA AAATAAAAAT AAAAAATTAA AGGATTGTTG ATAAAA 

Seq ID NO: 123 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

I I I I I I 

EKLGETPLVP EQDNSVTSIP EIPRWGSQST MSTLQMSLQA ESKATITPSG SVISKFNSTT 60 

SSATPQAFDS SSAVVFIFVS TAVVVLVILT MTVLGLVKLC FHESPSSQPR KESMGPPGLE 120 
SDPEPAALGS SSAHCTNNGV KVGDCDLRDR AEGALLAESP LGSSDA 



Seq ID NO: 124 DNA sequence 

Nucleic Acid Accession #: NM_006500.1 

Coding sequence: 27-1967 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ACTTGCGTCT CGCCCTCCGG CCAAGCATGG GGCTTCCCAG GCTGGTCTGC GCCTTCTTGC 60 

TCGCCGCCTG CTGCTGCTGT CCTCGCGTCG CGGGTGTGCC CGGAGAGGCT GAGCAGCCTG 120 

CGCCTGAGCT GGTGGAGGTG GAAGTGGGCA GCACAGCCCT TCTGAAGTGC GGCCTCTCCC 180 

AGTCCCAAGG CAACCTCAGC CATGTCGACT GGTTTTCTGT CCACAAGGAG AAGCGGACGC 240 

TCATCTTCCG TGTGCGCCAG GGCCAGGGCC AGAGCGAACC TGGGGAGTAC GAGCAGCGGC 300 

TCAGCCTCCA GGACAGAGGG GCTACTCTGG CCCTGACTCA AGTCACCCCC CAAGACGAGC 360 
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GCATCTTCTT GTGCCAGGGC AAGCGCCCTC GGTCCCAGGA GTACCGCATC CAGCTCCGCG 420 

TCTACAAAGC TCCGGAGGAG CCAAACATCC AGGTCAACCC CCTGGGCATC CCTGTGAACA 480 

GTAAGGAGCC TGAGGAGGTC GCTACCTGTG TAGGGAGGAA CGGGTACCCC ATTCCTCAAG 540 

TCATCTGGTA CAAGAATGGC CGGCCTCTGA AGGAGGAGAA GAACCGGGTC CACATTCAGT 600 

CGTCCCAGAC TGTGGAGTCG AGTGGTTTGT ACACCTTGCA GAGTATTCTG AAGGCACAGC 660 

TGGTTAAAGA AGACAAAGAT GCCCAGTTTT ACTGTGAGCT CAACTACCGG CTGCCCAGTG 720 

GGAACCACAT GAAGGAGTCC AGGGAAGTCA CCGTCCCTGT TTTCTACCCG ACAGAAAAAG 780 

TGTGGCTGGA AGTGGAGCCC GTGGGAATGC TGAAGGAAGG GGACCGCGTG GAAATCAGGT 840 

GTTTGGCTGA TGGCAACCCT CCACCACACT TCAGCATCAG CAAGCAGAAC CCCAGCACCA 900 

GGGAGGCAGA GGAAGAGACA ACCAACGACA ACGGGGTCCT GGTGCTGGAG CCTGCCCGGA 960 

AGGAACACAG TGGGCGCTAT GAATGTCAGG CCTGGAACTT GGACACCATG ATATCGCTGC 1020 

TGAGTGAACC ACAGGAACTA CTGGTGAACT ATGTGTCTGA CGTCCGAGTG AGTCCCGCAG 1080 

CCCCTGAGAG ACAGGAAGGC AGCAGCCTCA CCCTGACCTG TGAGGCAGAG AGTAGCCAGG 1140 

ACCTCGAGTT CCAGTGGCTG AGAGAAGAGA CAGACCAGGT GCTGGAAAGG GGGCCTGTGC 1200 

TTCAGTTGCA TGACCTGAAA CGGGAGGCAG GAGGCGGCTA TCGCTGCGTG GCGTCTGTGC 1260 

CCAGCATACC CGGCCTGAAC CGCACACAGC TGGTCAAGCT GGCCATTTTT GGCCCCCCTT 1320 

GGATGGCATT CAAGGAGAGG AAGGTGTGGG TGAAAGAGAA TATGGTGTTG AATCTGTCTT 1380 

GTGAAGCGTC AGGGCACCCC CGGCCCACCA TCTCCTGGAA CGTCAACGGC ACGGCAAGTG 1440 

AACAAGACCA AGATCCACAG CGAGTCCTGA GCACCCTGAA TGTCCTCGTG ACCCCGGAGC 1500 

TGTTGGAGAC AGGTGTTGAA TGCACGGCCT CCAACGACCT GGGCAAAAAC ACCAGCATCC 1560 

TCTTCCTGGA GCTGGTCAAT TTAACCACCC TCACACCAGA CTCCAACACA ACCACTGGCC 1620 

TCAGCACTTC CACTGCCAGT CCTCATACCA GAGCCAACAG CACCTCCACA GAGAGAAAGC 1680 

TGCCGGAGCC GGAGAGCCGG GGCGTGGTCA TCGTGGCTGT GATTGTGTGC ATCCTGGTCC 1740 

TGGCGGTGCT GGGOGCTGTC CTCTATTTCC TCTATAAGAA GGGCAAGCTG CCGTGCAGGC 1800 

GCTCAGGGAA GCAGGAGATC ACGCTGCCCC CGTCTCGTAA GACCGAACTT GTAGTTGAAG 1860 

TTAAGTCAGA TAAGCTCCCA GAAGAGATGG GCCTCCTGCA GGGCAGCAGC GGTGACAAGA 1920 

GGGCTCCGGG AGACCAGGGA GAGAAATACA TCGATCTGAG GCATTAGCCC CGAATCACTT 1980 

CAGCTCCCTT CCCTGCCTGG ACCATTCCCA GCTCCCTGCT CACTCTTCTC TCAGCCAAAG 2040 

CCTCCAAAGG GACTAGAGAG AAGCCTCCTG CTCCCCTCAC CTGCACACCC CCTTTCAGAG 2100 

GGCCACTGGG TTAGGACCTG AGGACCTCAC TTGGCCCTGC AAGCCGCTTT TCAGGGACCA 2160 

GTCCACCACC ATCTCCTGCA OGTTGAGTGA AGCTCATCCC AAGCAAGGAG CCCCAGTCTC 2220 

CCGAGCGGGT AGGAGAGTTT CTTGCAGAAC GTGTTTTTTC TTTACACACA TTATGGCTGT 2280 

AAATACCTGG CTCCTGCCAG CAGCTGAGCT GGGTAGCCTC TCTGAGCTGG TTTCCTGCCC 2340 

CAAAGGCTGG CTTCCACCAT CCAGGTGCAC CACTGAAGTG AGGACACACC GGAGCCAGGC 2400 

GCCTGCTCAT GTTGAAGTGC GCTGTTCACA CCCGCTCCGG AGAGCACCCC AGCGGCATCC 2460 

AGAAGCAGCT GCAGTGTTGC TGCCACCACC CTCCTGCTCG CCTCTTCAAA GTCTCCTGTG 2520 

ACATTTTTTC TTTGGTCAGA AGCCAGGAAC TGGTGTCATT CCTTAAAAGA TACGTGCCGG 2580 

GGCCAGGTGT GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCGA GGCGGGCGGA 2640 

TCACAAAGTC AGGACGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CTACTAAAAA 2700 

TACAAAAAAA AATTAGCTAG GCGTAGTGGT TGGCACCTAT AGTCCCAGCT ACTCGGAAGG 2760 

CTGAAGCAGG AGAATGGTAT GAATCCAGGA GGTGGAGCTT GCAGTGAGCC GAGACCGTGC 2820 

CACTGCACTC CAGCCTGGGC AACACAGCGA GACTCCGTCT CGAGGAAAAA AAAAGAAAAG 2880 

ACGCGTACCT GCGGTGAGGA AGCTGGGCGC TGTTTTCGAG TTCAGGTGAA TTAGCCTCAA 2940 

TCCCCGTGTT CACTTGCTCC CATAGCCCTC TTGATGGATC ACGTAAAACT GAAAGGCAGC 3000 

GGGGAGCAGA CAAAGATGAG GTCTACACTG TCCTTCATGG GGATTAAAGC TATGGTTATA 3060 

TTAGCACCAA ACTTCTACAA ACCAAGCTCA GGGCCCCAAC CCTAGAAGGG CCCAAATGAG 3120 

AGAATGGTAC TTAGGGATGG AAAACGGGGC CTGGCTAGAG CTTCGGGTGT GTGTGTCTGT 3180 

CTGTGTGTAT GCATACATAT GTGTGTATAT ATGGTTTTGT CAGGTGTGTA AATTTGCAAA 3240 

TTGTTTCCTT TATATATGTA TGTATATATA TATATGAAAA TATATATATA TATGAAAAAT 3300 

AAAGCTTAAT TGTCCCAGAA AATCATACAT TGCTTTTTTA TTCTACATGG GTACCACAGG 3360 

AACCTGGGGG CCTGTGAAAC TACAACCAAA AGGCACACAA AACCGTTTCC AGTTGGCAGC 3420 

AGAGATCAGG GGTTACCTCT GCTTCTGAGC AAATGGCTCA AGCTCTACCA GAGCAGACAG 3480 

CTACCCTACT TTTCAGCAGC AAAACGTCCC GTATGACGCA GCACGAAGGG CCTGGCAGGC 3540 
TGTTAGCAGG AGCTATGTCC CTTCCTATCG TTTCCGTCCA CTT 

Seq ID No: 125 Protein sequence: 
Protein Accession #: NP_006491.1 



1 11 21 31 41 51 

I I I I I I 

MGLPRLVCAF LLAACCCCPR VAGVPGEAEQ PAPELVEVEV GSTALLKCGL SQSQGNLSHV 60 

DWFSVHKEKR TLIFRVRQGQ GQSEPGEYEQ RLSLQDRGAT LALTQVTPQD ERIFLCQGKR 120 

PRSQEYRIQL RVYKAPEEPN IQVNPLGIPV NSKEPEEVAT CVGRNGYPIP QVIWYKNGRP 180 

LKEEKNRVHI QSSQTVESSG LYTLQSILKA QLVKEDKDAQ FYCELNYRLP SGNHMKESRE 240 

VTVPVFYPTE KVWLEVEPVG MLKEGDRVEI RCLADGNPPP HFSISKQNPS TREAEEETTN 300 

DNGVLVLEPA RKEHSGRYEC QAWNLDTMIS LLSEPQELLV NYVSDVRVSP AAPERQEGSS 360 

LTLTCEAESS QDLEFQWLRE ETDQVLERGP VLQLHDLKRE AGGGYRCVAS VPSIPGLNRT 420 

QLVKLAIFGP PWMAFKERKV WVKENMVLNL SCEASGHPRP TISWNVNGTA SEQDQDPQRV 480 

LSTLNVLVTP ELLETGVECT ASNDLGKNTS ILFLELVNLT TLTPDSNTTT GLSTSTASPH 540 

TRANSTSTER KLPEPESRGV VIVAVIVCIL VLAVLGAVLY FLYKKGKLPC RRSGKQEITL 600 
PPSRKTELVV EVKSDKLPEE MGLLQGSSGD KRAPGDQGEK YIDLRH 



Seq ID NO: 126 DNA sequence 

Nucleic Acid Accession #: NM_001955.1 

Coding sequence: 337-975 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GGAGCTGTTT ACCCCCACTC TAATAGGGGT TCAATATAAA AAGCCGGCAG AGAGCTGTCC 60 

AAGTCAGACG CGCCTCTGCA TCTGCGCCAG GCGAACGGGT CCTSOGCCTC CTGCAGTCCC 120 

AGCTCTCCAC CACCGCCGCG TGCGCCTGCA GACGCTCCGC TCGCTGCCTT CTCTCCTGGC 180 

AGGCGCTGCC TTTTCTCCCC GTTAAAGGGC ACTTGGGCTG AAGGATCGCT TTGAGATCTG 240 

AGGAACCCGC AGCGCTTTGA GGGACCTGAA GCTGTTTTTC TTCGTTTTCC TTTGGGTTCA 300 

GTTTGAACGG GAGGTTTTTG ATCCCTTTTT TTCAGAATGG ATTATTTGCT CATGATTTTC 360 

TCTCTGCTGT TTGTGGCTTG CCAAGGAGCT CCAGAAACAG CAGTCTTAGG CGCTGAGCTC 420 

AGCGCGGTGG GTGAGAACGG CGGGGAGAAA CCCACTCCCA GTCCACCCTG GCGGCTCCGC 480 

CGGTCCAAGC GCTGCTCCTG CTCGTCCCTG ATGGATAAAG AGTGTGTCTA CTTCTGCCAC 540 

CTGGACATCA TTTGGGTCAA CACTCCCGAG CACGTTGTTC CGTATGGACT TGGAAGCCCT 600 

AGGTCCAAGA GAGCCTTGGA GAATTTACTT CCCACAAAGG CAACAGACCG TGAGAATAGA 660 

TGCCAATGTG CTAGCCAAAA AGACAAGAAG TGCTGGAATT TTTGCCAAGC AGGAAAAGAA 720 

CTCAGGGCTG AAGACATTAT GGAGAAAGAC TGGAATAATC ATAAGAAAGG AAAAGACTGT 780 

TCCAAGCTTG GGAAAAAGTG TATTTATCAG CAGTTAGTGA GAGGAAGAAA AATCAGAAGA 840 

AGTTCAGAGG AACACCTAAG ACAAACCAGG TCGGAGACCA TGAGAAACAG CGTCAAATCA 900 

TCTTTTCATG ATCCCAAGCT GAAAGGCAAG CCCTCCAGAG AGCGTTATGT GACCCACAAC 960 

CGAGCACATT GGTGACAGAC TTCGGGGCCT GTCTGAAGCC ATAGCCTCCA CX3GAGAGCCC 1020 

TGTGGCCGAC TCTGCACTCT CCACCCTGGC TGGGATCAGA GCAGGAGCAT CCTCTGCTGG 1080 

TTCCTGACTG GCAAAGGACC AGCGTCCTCG TTCAAAACAT TCCAAGAAAG GTTAAGGAGT 1140 

TCCCCCAACC ATCTTCACTG GCTTCCATCA GTGGTAACTG CTTTGGTCTC TTCTTTCATC 1200 
TGGGGATGAC AATGGACCTC TCAGCAGAAA CACACAGTCA CATTCGAATT C 

Seq ID No: 127 Protein sequence:
Protein Accession #: NP_001946.1

1 11 21 31 41 51

I I I I I I

MDYLLMIFSL LFVACQGAPE TAVLGAELSA VGENGGEKPT PSPPWRLRRS KRCSCSSLMD 60

KECVYFCHLD IIWVNTPEHV VPYGLGSPRS KRALENLLPT KATDRENRCQ CASQKDKKCW 120

NFCQAGKELR AEDIMEKDWN NHKKGKDCSK LGKKCIYQQL VRGRKIRRSS EEHLRQTRSE 180
TMRNSVKSSF HDPKLKGKPS RERYVTHNRA HW



Seq ID NO: 128 DNA sequence 

Nucleic Acid Accession #: NM_001721.1 

Coding sequence: 34-2061 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GCAAGCACGG AACAAGCTGA GACGGATGAT AATATGGATA CAAAATCTAT TCTAGAAGAA 60

CTTCTTCTCA AAAGATCACA GCAAAAGAAG AAAATGTCAC CAAATAATTA CAAAGAACGG 120

CTTTTTGTTT TGACCAAAAC AAACCTTTCC TACTATGAAT ATGACAAAAT GAAAAGGGGC 180

AGCAGAAAAG GATCCATTGA AATTAAGAAA ATCAGATGTG TGGAGAAAGT AAATCTCGAG 240

GAGCAGACGC CTGTAGAGAG ACAGTACCCA TTTCAGATTG TCTATAAAGA TGGGCTTCTC 300

TATGTCTATG CATCAAATGA AGAGAGCCGA AGTCAGTGGT TGAAAGCATT ACAAAAAGAG 360

ATAAGGGGTA ACCCCCACCT GCTGGTCAAG TACCATAGTG GGTTCTTCGT GGACGGGAAG 420

TTCCTGTGTT GCCAGCAGAG CTGTAAAGCA GCCCCAGGAT GTACCCTCTG GGAAGCATAT 480

GCTAATCTGC ATACTGCAGT CAATGAAGAG AAACACAGAG TTCCCACCTT CCCAGACAGA 540

GTGCTGAAGA TACCTCGGGC AGTTCCTGTT CTCAAAATGG ATGCACCATC TTCAAGTACC 600

ACTCTAGCCC AATATGACAA CGAATCAAAG AAAAACTATG GCTCCCAGCC ACCATCTTCA 660

AGTACCAGTC TAGCGCAATA TGACAGCAAC TCAAAGAAAA TCTATGGCTC CCAGCCAAAC 720

TTCAACATGC AGTATATTCC AAGGGAAGAC TTCCCTGACT GGTGGCAAGT AAGAAAACTG 780

AAAAGTAGCA GCAGCAGTGA AGATGTTGCA AGCAGTAACC AAAAAGAAAG AAATGTGAAT 840

CACACCACCT CAAAGATTTC ATGGGAATTC CCTGAGTCAA GTTCATCTGA AGAAGAGGAA 900

AACCTGGATG ATTATGACTG GTTTGCTGGT AACATCTCCA GATCACAATC TGAACAGTTA 960

CTCAGACAAA AGGGAAAAGA AGGAGCATTT ATGGTTAGAA ATTCGAGCCA AGTGGGAATG 1020

TACACAGTGT CCTTATTTAG TAAGGCTGTG AATGATAAAA AAGGAACTGT CAAACATTAC 1080

CACGTGCATA CAAATGCTGA GAACAAATTA TACCTGGCAG AAAACTACTG TTTTGATTCC 1140

ATTCCAAAGC TTATTCATTA TCATCAACAC AATTCAGCAG GCATGATCAC ACGGCTCCGC 1200

CACCCTGTGT CAACAAAGGC CAACAAGGTC CCCGACTCTG TGTCCCTGGG AAATGGAATC 1260

TGGGAACTGA AAAGAGAAGA GATTACCTTG TTGAAGGAGC TGGGAAGTGG CCAGTTTGGA 1320

GTGGTCCAGC TGGGCAAGTG GAAGGGGCAG TATGATGTTG CTGTTAAGAT GATCAAGGAG 1380

GGCTCCATGT CAGAAGATGA ATTCTTTCAG GAGGCCCAGA CTATGATGAA ACTCAGCCAT 1440

CCCAAGCTGG TTAAATTCTA TGGAGTGTGT TCAAAGGAAT ACCCCATATA CATAGTGACT 1500

GAATATATAA GCAATGGCTG CTTGCTGAAT TACCTGAGGA GTCACGGAAA AGGACTTGAA 1560

CCTTCCCAGC TCTTAGAAAT GTGCTACGAT GTCTGTGAAG GCATGGCCTT CTTGGAGAGT 1620

CACCAATTCA TACACCGGGA CTTGGCTGCT CGTAACTGCT TGGTGGACAG AGATCTCTGT 1680

GTGAAAGTAT CTGACTTTGG AATGACAAGG TATGTTCTTG ATGACCAGTA TGTCAGTTCA 1740

GTCGGAACAA AGTTTCCAGT CAAGTGGTCA GCTCCAGAGG TGTTTCATTA CTTCAAATAC 1800 

AGCAGCAAGT CAGACGTATG GGCATTTGGG ATCCTGATGT GGGAGGTGTT CAGCCTGGGG 1860 

AAGCAGCCCT ATGACTTGTA TGACAACTCC CAGGTGGTTC TGAAGGTCTC CCAGGGCCAC 1920 

AGGCTTTACC GGCCCCACCT GGCATCGGAC ACCATCTACC AGATCATGTA CAGCTGCTGG 1980 

CACGAGCTTC CAGAAAAGCG TCCCACATTT CAGCAACTCC TGTCTTCCAT TGAACCACTT 2040 

CGGGAAAAAG ACAAGCATTG AAGAAGAAAT TAGGAGTGCT GATAAGAATG AATATAGATG 2100 

CTGGCCAGCA TTTTCATTCA TTTTAAGGAA AGTAGGAAGG CATAAGTAAT TTTAGCTAGT 2160 

TTTTAATAGT GTTCTCTGTA TTGTCTATTA TTTAGAAATG AACAAGGCAG GAAACAAAAG 2220 

ATTCCCTTGA AATTTAGATC AAATTAGTAA TTTTGTTTTA TGCTGCTCCT GATATAACAC 2280 

TTTCCAGCCT ATAGCAGAAG CACATTTTCA GACTGCAATA TAGAGACTGT GTTCATGTGT 2340 

AAAGACTGAG CAGAACTGAA AAATTACTTA TTGGATATTC ATTCTTTTCT TTATATTGTC 2400 
ATTGTCACAA CAATTAAATA TACTACCAAG TACAGAAATG TGGAAAAAAA AAACCG 

Seq ID No: 129 Protein sequence:
Protein Accession #: NP_001712.1

1 11 21 31 41 51

I I I I I I

MDTKSILEEL LLKRSQQKKK MSPNNYKERL FVLTKTNLSY YEYDKMKRGS RKGSIEIKKI 60

RCVEKVNLEE QTPVERQYPF QIVYKDGLLY VYASNEESRS QWLKALQKEI RGNPHLLVKY 120

HSGFFVDGKF LCCQQSCKAA PGCTLWEAYA NLHTAVNEEK HRVPTFPDRV LKIPRAVPVL 180

KMDAPSSSTT LAQYDNESKK NYGSQPPSSS TSLAQYDSNS KKIYGSQPNF NMQYIPREDF 240

PDWWQVRKLK SSSSSEDVAS SNQKERNVNH TTSKISWEFP ESSSSEEEEN LDDYDWFAGN 300

ISRSQSEQLL RQKGKEGAFM VRNSSQVGMY TVSLFSKAVN DKKGTVKHYH VHTNAENKLY 360

LAENYCFDSI PKLIHYHQHN SAGMITRLRH PVSTKANKVP DSVSLGNGIW ELKREEITLL 420

KELGSGQFGV VQLGKWKGQY DVAVKMIKEG SMSEDEFFQE AQTMMKLSHP KLVKFYGVCS 480

KEYPIYIVTE YISNGCLLNY LRSHGKGLEP SQLLEMCYDV CEGMAFLESH QFIHRDLAAR 540

NCLVDRDLCV KVSDFGMTRY VLDDQYVSSV GTKFPVKWSA PEVFHYFKYS SKSDVWAFGI 600

LMWEVFSLGK QPYDLYDNSQ VVLKVSQGHR LYRPHLASDT IYQIMYSCWH ELPEKRPTFQ 660 
QLLSSIEPLR EKDKH 



Seq ID NO: 130 DNA sequence 

Nucleic Acid Accession #: NM_012072.2 

Coding sequence: 149-2107 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

AAAGCCCTCA GCCTTTGTGT CCTTCTCTGC GCCGGAGTGG CTGCAGCTCA CCCCTCAGCT 60 

CCCCTTGGGG CCCAGCTGGG AGCCGAGATA GAAGCTCCTG TCGCCGCTGG GCTTCTCGCC 120 

TCCCGCAGAG GGCCACACAG AGACCGGGAT GGCCACCTCC ATGGGCCTGC TGCTGCTGCT 180 

GCTGCTGCTC CTGACCCAGC CCGGGGCGGG GACGGGAGCT GACACGGAGG CGGTGGTCTG 240 

CGTGGGGACC GCCTGCTACA CGGCCCACTC GGGCAAGCTG AGCGCTGCCG AGGCCCAGAA 300 

CCACTGCAAC CAGAACGGGG GCAACCTGGC CACTGTGAAG AGCAAGGAGG AGGCCCAGCA 360 

CGTCCAGCGA GTACTGGCCC AGCTCCTGAG GCGGGAGGCA GCCCTGACGG CGAGGATGAG 420 

CAAGTTCTGG ATTGGGCTCC AGCGAGAGAA GGGCAAGTGC CTGGACCCTA GTCTGCCGCT 480 

GAAGGGCTTC AGCTGGGTGG GCGGGGGGGA GGACACGCCT TACTCTAACT GGCACAAGGA 540 

GCTCCGGAAC TCGTGCATCT CCAAGCGCTG TGTGTCTCTG CTGCTGGACC TGTCCCAGCC 600 

GCTCCTTCCC AACCGCCTGC CCAAGTGGTC TGAGGGCCCC TGTGGGAGCC CAGGCTCCCC 660 

CGGAAGTAAC ATTGAGGGCT TCGTGTGCAA GTTCAGCTTC AAAGGCATGT GCCGGCCTCT 720 

GGCCCTGGGG GGCCCAGGTC AGGTGACCTA CACCACCCCC TTCCAGACCA CCAGTTCCTC 780 

CTTGGAGGCT GTGCCCTTTG CCTCTGCGGC CAATGTAGCC TGTGGGGAAG GTGACAAGGA 840 

CGAGACTCAG AGTCATTATT TCCTGTGCAA GGAGAAGGCC CCCGATGTGT TCGACTGGGG 900 

CAGCTCGGGC CCCCTCTGTG TCAGCCCCAA GTATGGCTGC AACTTCAACA ATGGGGGCTG 960 

CCACCAGGAC TGCTTTGAAG GGGGGGATGG CTCCTTCCTC TGCGGCTGCC GACCAGGATT 1020 

CCGGCTGCTG GATGACCTGG TGACCTGTGC CTCTCGAAAC CCTTGCAGCT CCAGCCCATG 1080 

TCGTGGGGGG GCCACGTGCG TCCTGGGACC CCATGGGAAA AACTACACGT GCCGCTGCCC 1140 

CCAAGGGTAC CAGCTGGACT CGAGTCAGCT GGACTGTGTG GACGTGGATG AATGCCAGGA 1200 

CTCCCCCTGT GCCCAGGAGT GTGTCAACAC CCCTGGGGGC TTCCGCTGCG AATGCTGGGT 1260 

TGGCTATGAG CCGGGCGGTC CTGGAGAGGG GGCCTGTCAG GATGTGGATG AGTGTGCTCT 1320 

GGGTCGCTCG CCTTGCGCCC AGGGCTGCAC CAACACAGAT GGCTCATTTC ACTGCTCCTG 1380 

TGAGGAGGGC TACGTCCTGG CCGGGGAGGA CGGGACTCAG TGCCAGGACG TGGATGAGTG 1440 

TGTGGGCCCG GGGGGCCCCC TCTGCGACAG CTTGTGCTTC AACACACAAG GGTCCTTCCA 1500 

CTGTGGCTGC CTGCCAGGCT GGGTGCTGGC CCCAAATGGG GTCTCTTGCA CCATGGGGCC 1560 

TGTGTCTCTG GGACCACCAT CTGGGCCCCC CGATGAGGAG GACAAAGGAG AGAAAGAAGG 1620 

GAGCACCGTG CCCCGCGCTG CAACAGCCAG TCCCACAAGG GGCCCCGAGG GCACCCCCAA 1680 

GGCTACACCC ACCACAAGTA GACCTTCGCT GTCATCTGAC GCCCCCATCA CATCTGCCCC 1740 

ACTCAAGATG CTGGCCCCCA GTGGGTCCTC AGGCGTCTGG AGGGAGCCCA GCATCCATCA 1800 

CGCCACAGCT GCCTCTGGCC CCCAGGAGCC TGCAGGTGGG GACTCCTCCG TGGCCACACA 1860 

AAACAACGAT GGCACTGACG GGCAAAAGCT GCTTTTATTC TACATCCTAG GCACCGTGGT 1920 

GGCCATCCTA CTCCTGCTGG CCCTGGCTCT GGGGCTACTG GTCTATCGCA AGCGGAGAGC 1980 

GAAGAGGGAG GAGAAGAAGG AGAAGAAGCC CCAGAATGCG GCAGACAGTT ACTCCTGGGT 2040 

TCCAGAGCGA GCTGAGAGCA GGGCCATGGA GAACCAGTAC AGTCCGACAC CTGGGACAGA 2100 

CTGCTGAAAG TGAGGTGGCC CTAGAGACAC TAGAGTCACC AGCCACCATC CTCAGAGCTT 2160 

TGAACTCCCC ATTCCAAAGG GGCACCCACA TTTTTTTGAA AGACTGGACT GGAATCTTAG 2220

CAAACAATTG TAAGTCTCCT CCTTAAAGGC CCCTTGGAAC ATGCAGGTAT TTTCTACGGG 2280

TGTTTGATGT TCCTGAAGTG GAAGCTGTGT GTTGGCGTGC CACGGTGGGG ATTTCGTGAC 2340

TCTATAATGA TTGTTACTCC CCCTCCCTTT TCAAATTCCA ATGTGACCAA TTCCGGATCA 2400

GGGTGTGAGG AGGCTGGGGC TAAGGGGCTC CCCTGAATAT CTTCTCTGCT CACTTCCACC 2460

ATCTAAGAGG AAAAGGTGAG TTGCTCATGC TGATTAGGAT TGAAATGATT TGTTTCTCTT 2520

CCTAGGATGA AAACTAAATC AATTAATTAT TCAATTAGGT AAGAAGATCT GGTTTTTTGG 2580

TCAAAGGGAA CATGTTCGGA CTGGAAACAT TTCTTTACAT TTGCATTCCT CCATTTCGCC 2640

AGCACAAGTC TTGCTAAATG TGATACTGTT GACATCCTCC AGAATGGCCA GAAGTGCAAT 2700

TAACCTCTTA GGTGGCAAGG AGGCAGGAAG TGCCTCTTTA GTTCTTACAT TTCTAATAGC 2760

CTTGGGTTTA TTTGCAAAGG AAGCTTGAAA AATATGAGAA AAGTTGCTTG AAGTGCATTA 2820

CAGGTGTTTG TGAAGTCACA TAATCTACGG GGCTAGGGCG AGAGAGGCCA GGGATTTGTT 2880

CACAGATACT TGAATTAATT CATCCAAATG TACTGAGGTT ACCACACACT TGACTACGGA 2940

TGTGATCAAC ACTAACAAGG AAACAAATTC AAGGACAACC TGTCTTTGAG CCAGGGCAGG 3000

CCTCAGACAC CCTGCCTGTG GCCCCGCCTC CACTTCATCC TGCCCGGAAT GCCAGTGCTC 3060

CGAGCTCAGA CAGAGGAAGC CCTGCAGAAA GTTCCATCAG GCTGTTTCCT AAAGGATGTG 3120

TGAACGGGAG ATGATGCACT GTGTTTTGAA AGTTGTCATT TTAAAGCATT TTAGCACAGT 3180

TCATAGTCCA CAGTTGATGC AGCATCCTGA GATTTTAAAT CCTGAAGTGT GGGTGGCGCA 3240

CACACCAAGT AGGGAGCTAG TCAGGCAGTT TGCTTAAGGA ACTTTTGTTC TCTGTCTCTT 3300

TTCCTTAAAA TTGGGGGTAA GGAGGGAAGG AAGAGGGAAA GAGATGACTA ACTAAAATCA 3360

TTTTTACAGC AAAAACTGCT CAAAGCCATT TAAATTATAT CCTCATTTTA AAAGTTACAT 3420

TTGCAAATAT TTCTCCCTAT GATAATGCAG TCGATAGTGT GCACTCTTTC TCTCTCTCTC 3480

TCTCTCTCAC ACACACACAC ACACACACAC ACACACACAC AGAGACACGG CACCATTCTG 3540

CCTGGGGCAC TGGAACACAT TCCTGGGGGT CACCGATGGT CAGAGTCACT AGAAGTTACC 3600

TGAGTATCTC TGGGAGGCCT CATGTCTCCT GTGGGCTTTT TACCACCACT GTGCAGGAGA 3660

ACAGACAGAG GAAATGTGTC TCCCTCCAAG GCCCCAAAGC CTCAGAGAAA GGGTGTTTCT 3720

GGTTTTGCCT TAGCAATGCA TCGGTCTCTG AGGTGACACT CTGGAGTGGT TGAAGGGCCA 3780

CAAGGTGCAG GGTTAATACT CTTGCCAGTT TTGAAATATA GATGCTATGG TTCAGATTGT 3840

TTTTAATAGA AAACTAAAGG GGCAGGGGAA GTGAAAGGAA AGATGGAGGT TTTGTGCGGC 3900

TOGATGGGGC ATTTGGAACT TCTTTTTAAA GTCATCTCAT GGTCTCCAGT TTTCAGTTGG 3960

AACTCTGGTG TTTAACACTT AAGGGAGACA AAGGCTGTGT CCATTTGGCA AAACTTCCTT 4020

GGCCACGAGA CTCTAGGTGA TGTGTGAAGC TGGGCAGTCT GTGGTGTGGA GAGCAGCCAT 4080

CTGTCTGGCC ATTCAGAGGA TTCTAAAGAC ATGGCTGGAT GCGCTGCTGA CCAACATCAG 4140

CACTTAAATA AATGCAAATG CAACATTTCT CCCTCTGGGC CTTGAAAATC CTTGCCCTTA 4200

TCATTTGGGG TGAAGGAGAC ATTTCTGTCC TTGGCTTCCC ACAGCCCCAA CGCAGTCTGT 4260

GTATGATTCC TGGGATCCAA CGAGCCCTCC TATTTTCACA GTGTTCTGAT TGCTCTCACA 4320

GCCCAGGCCC ATCGTCTGTT CTCTGAATGC AGCCCTGTTC TCAACAACAG GGAGGTCATG 4380

GAACCCCTCT GTGGAACCCA CAAGGGGAGA AATGGGTGAT AAAGAATCCA GTTCCTCAAA 4440

ACCTTCCCTG GCAGGCTGGG TCCCTCTCCT GCTGGGTGGT GCTTTCTCTT GCACACCACT 4500

CCCACCACGG GGGGAGAGCC AGCAACCCAA CCAGACAGCT CAGGTTGTGC ATCTGATGGA 4560

AACCACTGGG CTCAAACACG TGCTTTATTC TCCTGTTTAT TTTTGCTGTT ACTTTGAAGC 4620

ATGGAAATTC TTGTTTGGGG GATCTTGGGG CTACAGTAGT GGGTAAACAA ATGCCCACCG 4680

GCCAAGAGGC CATTAACAAA TCGTCCTTGT CCTGAGGGGC CCCAGCTTGC TCGGGCGTGG 4740

CACAGTGGGG AATCCAAGGG TCACAGTATG GGGAGAGGTG CACCCTGCCA CCTGCTAACT 4800

TCTOGCTAGA CACAGTGTTT CTGCCCAGGT GACCTGTTCA GCAGCAGAAC AAGCCAGGGC 4860

CATGGGGACG GGGGAAGTTT TCACTTGGAG ATGGACACCA AGACAATGAA GATTTGTTGT 4920

CCAAATAGGT CAATAATTCT GGGAGACTCT TGGAAAAAAC TGAATATATT CAGGACCAAC 4980

TCTCTCCCTC CCCTCATCCC ACATCTCAAA GCAGACAATG TAAAGAGAGA ACATCTCACA 5040

CACCCAGCTC GCGATGCCTA CTCATTCCTG AATTTCAGGT GCCATCACTG CTCTTTCTTT 5100

CTTCTTTGTC ATTTGAGAAA GGATGCAGGA GGACAATTCC CACAGATAAT CTGAGGAATG 5160

CAGAAAAACC AGGGCAGGAC AGTTATCGAC AATGCATTAG AACTTGGTGA GCATCCTCTG 5220

TAGAGGGACT CCACCCCTGC TCAACAGCTT GGCTTCCAGG CAAGACCAAC CACATCTGGT 5280

CTCTGCCTTC GGTGGCCCAC ACACCTAAGC GTCATCGTCA TTGCCATAGC ATCATGATGC 5340

AACACATCTA CGTGTAGCAC TACGAOGTTA TGTTTGGGTA ATGTGGGGAT GAACTGCATG 5400

AGGCTCTGAT TAAGGATGTG GGGAAGTGGG CTGCGGTCAC TGTOGGCCTT GCAAGGCCAC 5460

CTGGAGGCCT GTCTGTTAGC CAGTGGTGGA GGAGCAAGGC TTCAGGAAGG GCCAGCCACA 5520

TGCCATCTTC CCTGCGATCA GGCAAAAAAG TGGAATTAAA AAGTCAAACC TTTATATGCA 5580

TGTGTTATGT CCATTTTGCA GGATGAACTG AGTTTAAAAG AATTTTTTTT TCTCTTCAAG 5640

TTGCTTTGTC TTTTCCATCC TCATCACAAG CCCTTGTTTG AGTGTCTTAT CCCTGAGCAA 5700

TCTTTCGATG GATGGAGATG ATCATTAGGT ACTTTTGTTT CAACCTTTAT TCCTGTAAAT 5760

ATTTCTGTGA AAACTAGGAG AACAGAGATG AGATTTGACA AAAAAAAATT GAATTAAAAA 5820

TAACACAGTC TTTTTAAAAC TAACATAGGA AAGCCTTTCC TATTATTTCT CTTCTTAGCT 5880

TCTCCATTGT CTAAATCAGG AAAACAGGAA AACACAGCTT TCTAGCAGCT GCAAAATGGT 5940

TTAATGCCCC CTACATATTT CCATCACCTT GAACAATAGC TTTAGCTTGG GAATCTGAGA 6000

TATGATCCCA GAAAACATCT GTCTCTACTT OGGCTGCAAA ACCCATGGTT TAAATCTATA 6060

TGGTTTGTGC ATTTTCTCAA CTAAAAATAG AGATGATAAT CCGAATTCTC CATATATTCA 6120

CTAATCAAAG ACACTATTTT CATACTAGAT TCCTGAGACA AATACTCACT GAAGGGCTTG 6180

TTTAAAAATA AATTGTGTTT TGGTCTGTTC TTGTAGATAA TGCCCTTCTA TTTTAGGTAG 6240

AAGCTCTGGA ATCCCTTTAT TGTGCTGTTG CTCTTATCTG CAAGGTGGCA AGCAGTTCTT 6300

TTCAGCAGAT TTTGCCCACT ATTCCTCTGA GCTGAAGTTC TTTGCATAGA TTTGGCTTAA 6360

GCTTGAATTA GATCCCTGCA AAGGCTTGCT CTGTGATGTC AGATGTAATT GTAAATGTCA 6420

GTAATCACTT CATGAATGCT AAATGAGAAT GTAAGTATTT TTAAATGTGT GTATTTCAAA 6480

TTTGTTTGAC TAATTCTGGA ATTACAAGAT TTCTATGCAG GATTTACCTT CATCCTGTGC 6540

ATGTTTCCCA AACTGTGAGG AGGGAAGGCT CAGAGATCGA GCTTCTCCTC TGAGTTCTAA 6600

CAAAATGGTG CTTTGAGGGT CAGCCTTTAG GAAGGTGCAG CTTTGTTGTC CTTTGAGCTT 6660
TCTGTTATGT GCCTATCCTA ATAAACTCTT AAACACATT

Seq ID No: 131 Protein sequence:
Protein Accession #: NP_036204.1

1 11 21 31 41 51

I I I I I I

MATSMGLLLL LLLLLTQPGA GTGADTEAVV CVGTACYTAH SGKLSAAEAQ NHCNQNGGNL 60

ATVKSKEEAQ HVQRVLAQLL RREAALTARM SKFWIGLQRE KGKCLDPSLP LKGFSWVGGG 120

EDTPYSNWHK ELRNSCISKR CVSLLLDLSQ PLLPNRLPKW SEGPCGSPGS PGSNIEGFVC 180

KFSFKGMCRP LALGGPGQVT YTTPFQTTSS SLEAVPFASA ANVACGEGDK DETQSHYFLC 240

KEKAPDVFDW GSSGPLCVSP KYGCNFNNGG CHQDCFEGGD GSFLCGCRPG FRLLDDLVTC 300

ASRNPCSSSP CRGGATCVLG PHGKNYTCRC PQGYQLDSSQ LDCVDVDECQ DSPCAQECVN 360

TPGGFRCECW VGYEPGGPGE GACQDVDECA LGRSPCAQGC TNTDGSFHCS CEEGYVLAGE 420

DGTQCQDVDE CVGPGGPLCD SLCFNTQGSF HCGCLPGWVL APNGVSCTMG PVSLGPPSGP 480

PDEEDKGEKE GSTVPRAATA SPTRGPEGTP KATPTTSRPS LSSDAPITSA PLKMLAPSGS 540

SGVWREPSIH HATAASGPQE PAGGDSSVAT QNNDGTDGQK LLLFYILGTV VAILLLLALA 600 
LGLLVYRKRR AKREEKKEKK PQNAADSYSW VPERAESRAM ENQYSPTPGT DC 



Seq ID NO: 132 DNA sequence 

Nucleic Acid Accession #: NM_000963.1 

Coding sequence: 135-1949 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

CAATTGTCAT ACGACTTGCA GTGAGCGTCA GGAGCACGTC CAGGAACTCC TCAGCAGCGC 60
CTCCTTCAGC TCCACAGCCA GACGCCCTCA GACAGCAAAG CCTACCCCCG CGCCGCGCCC 120
TGCCCGCCGC TCGGATGCTC GCCCGCGCCC TGCTGCTGTG CGCGGTCCTG GCGCTCAGCC 180
ATACAGCAAA TCCTTGCTGT TCCCACCCAT GTCAAAACCG AGGTGTATGT ATGAGTGTGG 240
GATTTGACCA GTATAAGTGC GATTGTACCC GGACAGGATT CTATGGAGAA AACTGCTCAA 300
CACCGGAATT TTTGACAAGA ATAAAATTAT TTCTGAAACC CACTCCAAAC ACAGTGCACT 360
ACATACTTAC CGACTTCAAG GGATTTTGGA ACGTTGTGAA TAACATTCCC TTCCTTCGAA 420
ATGCAATTAT GAGTTATGTC TTGACATCCA GATCACATTT GATTGACAGT CCACCAACTT 480
ACAATGCTGA CTATGGCTAC AAAAGCTGGG AAGCCTTCTC TAACCTCTCC TATTATACTA 540
GAGCCCTTCC TCCTGTGCCT GATGATTGCC CGACTCCCTT GGGTGTCAAA GGTAAAAAGC 600
AGCTTCCTGA TTCAAATGAG ATTGTGGAAA AATTGCTTCT AAGAAGAAAG TTCATCCCTG 660
ATCCCCAGGG CTCAAACATG ATGTTTGCAT TCTTTGCCCA GCACTTCACG CATCAGTTTT 720
TCAAGACAGA TCATAAGCGA GGGCCAGCTT TCACCAACGG GCTGGGCCAT GGGGTGGACT 780
TAAATCATAT TTACGGTGAA ACTCTGGCTA GACAGCGTAA ACTGCGCCTT TTCAAGGATG 840
GAAAAATGAA ATATCAGATA ATTGATGGAG AGATGTATCC TCCCACAGTC AAAGATACTC 900
AGGCAGAGAT GATCTACCCT CCTCAAGTCC CTGAGCATCT ACGGTTTGCT GTGGGGCAGG 960
AGGTCTTTGG TCTGGTGCCT GGTCTGATGA TGTATGCCAC AATCTGGCTG CGGGAACACA 1020
ACAGAGTATG CGATGTGCTT AAACAGGAGC ATCCTGAATG GGGTGATGAG CAGTTGTTCC 1080
AGACAAGCAG GCTAATACTG ATAGGAGAGA CTATTAAGAT TGTGATTGAA GATTATGTGC 1140
AACACTTGAG TGGCTATCAC TTCAAACTGA AATTTGACCC AGAACTACTT TTCAACAAAC 1200
AATTCCAGTA CCAAAATCGT ATTGCTGCTG AATTTAACAC CCTCTATCAC TGGCATCCCC 1260
TTCTGCCTGA CACCTTTCAA ATTCATGACC AGAAATACAA CTATCAACAG TTTATCTACA 1320
ACAACTCTAT ATTGCTGGAA CATGGAATTA CCCAGTTTGT TGAATCATTC ACCAGGCAAA 1380
TTGCTGGCAG GGTTGCTGGT GGTAGGAATG TTCCACCCGC AGTACAGAAA GTATCACAGG 1440
CTTCCATTGA CCAGAGCAGG CAGATGAAAT ACCAGTCTTT TAATGAGTAC CGCAAACGCT 1500
TTATGCTGAA GCCCTATGAA TCATTTGAAG AACTTACAGG AGAAAAGGAA ATGTCTGCAG 1560
AGTTGGAAGC ACTCTATGGT GACATCGATG CTGTGGAGCT GTATCCTGCC CTTCTGGTAG 1620
AAAAGCCTCG GCCAGATGCC ATCTTTGGTG AAACCATGGT AGAAGTTGGA GCACCATTCT 1680
CCTTGAAAGG ACTTATGGGT AATGTTATAT GTTCTCCTGC CTACTGGAAG CCAAGCACTT 1740
TTGGTGGAGA AGTGGGTTTT CAAATCATCA ACACTGCCTC AATTCAGTCT CTCATCTGCA 1800
ATAACGTGAA GGGCTGTCCC TTTACTTCAT TCAGTGTTCC AGATCCAGAG CTCATTAAAA 1860
CAGTCACCAT CAATGCAAGT TCTTCCCGCT CCGGACTAGA TGATATCAAT CCCACAGTAC 1920
TACTAAAAGA ACGTTCGACT GAACTGTAGA AGTCTAATGA TCATATTTAT TTATTTATAT 1980
GAACCATGTC TATTAATTTA ATTATTTAAT AATATTTATA TTAAACTCCT TATGTTACTT 2040
AACATCTTCT GTAACAGAAG TCAGTACTCC TGTTGCGGAG AAAGGAGTCA TACTTGTGAA 2100
GACTTTTATG TCACTACTCT AAAGATTTTG CTGTTGCTGT TAAGTTTGGA AAACAGTTTT 2160
TATTCTGTTT TATAAACCAG AGAGAAATGA GTTTTGACGT CTTTTTACTT GAATTTCAAC 2220
TTATATTATA AGAACGAAAG TAAAGATGTT TGAATACTTA AACACTATCA CAAGATGGCA 2280
AAATGCTGAA AGTTTTTACA CTGTCGATGT TTCCAATGCA TCTTCCATGA TGCATTAGAA 2340
GTAACTAATG TTTGAAATTT TAAAGTACTT TTGGTTATTT TTCTGTCATC AAACAAAAAC 2400
AGGTATCAGT GCATTATTAA ATGAATATTT AAATTAGACA TTACCAGTAA TTTCATGTCT 2460
ACTTTTTAAA ATCAGCAATG AAACAATAAT TTGAAATTTC TAAATTCATA GGGTAGAATC 2520
ACCTGTAAAA GCTTGTTTGA TTTCTTAAAG TTATTAAACT TGTACATATA CCAAAAAGAA 2580
GCTGTCTTGG ATTTAAATCT GTAAAATCAG ATGAAATTTT ACTACAATTG CTTGTTAAAA 2640
TATTTTATAA GTGATGTTCC TTTTTCACCA AGAGTATAAA CCTTTTTAGT GTGACTGTTA 2700
AAACTTCCTT TTAAATCAAA ATGCCAAATT TATTAAGGTG GTGGAGCCAC TGCAGTGTTA 2760
TCTCAAAATA AGAArATTTT GTTCAGATAT TCCAGAATTT GTTTATATGG CTGGTAACAT 2820
GTAAAATCTA TATCAGCAAA AGGGTCTACC TTTAAAATAA GCAATAACAA AGAAGAAAAC 2880
CAAATTATTG TTCAAATTTA GGTTTAAACT TTTGAAGCAA ACTTTTTTTT ATCCTTGTGC 2940
ACTGCAGGCC TGGTACTCAG ATTTTGCTAT GAGGTTAATG AAGTACCAAG CTGTGCTTGA 3000
ATAACGATAT GTTTTCTCAG ATTTTCTGTT GTACAGTTTA ATTTAGCAGT CCATATCACA 3060
TTGCAAAAGT AGCAATGACC TCATAAAATA CCTCTTCAAA ATGCTTAAAT TCATTTCACA 3120
CATTAATTTT ATCTCAGTCT TGAAGCCAAT TCAGTAGGTG CATTGGAATC AAGCCTGGCT 3180
ACCTGCATGC TGTTCCTTTT CTTTTCTTCT TTTAGCCATT TTGCTAAGAG ACACAGTCTT 3240
CTCATCACTT CGTTTCTCCT ATTTTGTTTT ACTAGTTTTA AGATCAGAGT TCACTTTCTT 3300
TGGACTCTGC CTATATTTTC TTACCTGAAC TTTTGCAAGT TTTCAGGTAA ACCTCAGCTC 3360
AGGACTGCTA TTTAGCTCCT CTTAAGAAGA TTAAAAGAGA AAAAAAAAGG CCCTTTTAAA 3420
AATAGTATAC ACTTATTTTA AGTGAAAAGC AGAGAATTTT ATTTATAGCT AATTTTAGCT 3480
ATCTGTAACC AAGATGGATG CAAAGAGGCT AGTGCCTCAG AGAGAACTGT ACGGGGTTTG 3540
TGACTGGAAA AAGTTACGTT CCCATTCTAA TTAATGCCCT TTCTTATTTA AAAACAAAAC 3600
CAAATGATAT CTAAGTAGTT CTCAGCAATA ATAATAATGA CGATAATACT TCTTTTCCAC 3660
ATCTCATTGT CACTGACATT TAATGGTACT GTATATTACT TAATTTATTG AAGATTATTA 3720
TTTATGTCTT ATTAGGACAC TATGGTTATA AACTGTGTTT AAGCCTACAA TCATTGATTT 3780
TTTTTTGTTA TGTCACAATC AGTATATTTT CTTTGGGGTT ACCTCTCTGA ATATTATGTA 3840
AACAATCCAA AGAAATGATT GTATTAAGAT TTGTGAATAA ATTTTTAGAA ATCTGATTGG 3900
CATATTGAGA TATTTAAGGT TGAATGTTTG TCCTTAGGAT AGGCCTATGT GCTAGCCCAC 3960
AAAGAATATT GTCTCATTAG CCTGAATGTG CCATAAGACT GACCTTTTAA AATGTTTTGA 4020
GGGATCTGTG GATGCTTCGT TAATTTGTTC AGCCACAATT TATTGAGAAA ATATTCTGTG 4080
TCAAGCACTG TGGGTTTTAA TATTTTTAAA TCAAACGCTG ATTACAGATA ATAGTATTTA 4140
TATAAATAAT TGAAAAAAAT TTTCTTTTGG GAAGAGGGAG AAAATGAAAT AAATATCATT 4200
AAAGATAACT CAGGAGAATC TTCTTTACAA TTTTACGTTT AGAATGTTTA AGGTTAAGAA 4260
AGAAATAGTC AATATGCTTG TATAAAACAC TGTTCACTGT TTTTTTTAAA AAAAAAACTT 4320
GATTTGTTAT TAACATTGAT CTGCTGACAA AACCTGGGAA TTTGGGTTGT GTATGCGAAT 4380
GTTTCAGTGC CTCAGACAAA TGTGTATTTA ACTTATGTAA AAGATAAGTC TGGAAATAAA 4440
TGTCTGTTTA TTTTTGTACT ATTTA



Seq ID No: 133 Protein sequence: 
Protein Accession #: NP_000954.1 



1 11 21 31 41 51 

I I I I I I 

MLARALLLCA VLALSHTANP CCSHPCQNRG VCMSVGFDQY KCDCTRTGFY GENCSTPEFL 60

TRIKLFLKPT PNTVHYILTH FKGFWNVVNN IPFLRKAIMS YVLTSRSHLI DSPPTYNADY 120

GYKSWEAFSN LSYYTRALPP VPDDCPTPLG VKGKKQLPDS NEIVEKLLLR RKFIPDPQGS 180

NMMFAFFAQH FTHQFFKTDH KRGPAFTNGL GHGVDLNHIY GETLARQRKL RLFKDGKMKY 240

QIIDGEMYPP TVKDTQAEMI YPPQVPEHLR FAVGQEVFGL VPGLMMYATI WLREHNRVCD 300

VLKQEHPEWG DEQLFQTSRL ILIGETIKIV IEDYVQHLSG YHFKLKFDPE LLFNKQFQYQ 360

NRIAAEFNTL YHWHPLLPDT FQIHDQKYNY QQFIYNNSIL LEHGITQFVE SFTRQIAGRV 420

AGGRNVPPAV QKVSQASIDQ SRQMKYQSFN EYRKRFMLKP YESFEELTGE KEMSAELEAL 480

YGDIDAVELY PALLVEKPRP DAIFGETMVE VGAPFSLKGL MGNVICSPAY WKPSTFGGEV 540

GFQIINTASI QSLICNNVKG CPFTSFSVPD PELIKTVTIN ASSSRSGLDD INPTVLLKER 600
STEL



Seq ID NO: 134 DNA sequence 

Nucleic Acid Accession #: XM_059648.1 

Coding sequence: 35-664 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

AGGCTGCTGA GACTTCCCTC TAGAATCCTC CAACATGGAG CCTCTTGCAG CTTACCCGCT 60 

AAAATGTTCC GGGCCCAGAG CAAAGGTATT TGCAGTTTTG CTGTCTATAG TTCTATGCAC 120 

AGTAACGCTA TTTCTTCTAC AACTAAAATT CCTCAAACCT AAAATCAACA GCTTTTATGC 180 

CTTTGAAGTG AAGGATGCAA AAGGAAGAAC TGTTTCTCTG GAAAAGTATA AAGGCAAAGT 240 

TTCACTAGTT GTAAACGTGG CCAGTGACTG CCAACTCACA GACAGAAATT ACTTAGGGCT 300 

GAAGGAACTG CACAAAGAGT TTGGACCATC CCACTTCAGC GTGTTGGCTT TTCCCTGCAA 360 

TCAGTTTGGA GAATCGGAGC CCCGCCCAAG CAAGGAAGTA GAATCTTTTG CAAGAAAAAA 420 

CTACGGAGTA ACTTTCCCCA TCTTCCACAA GATTAAGATT CTAGGATCTG AAGGAGAACC 480 

TGCATTTAGA TTTCTTGTTG ATTCTTCAAA GAAGGAACCA AGGTGGAATT TTTGGAAGTA 540

TCTTGTCAAC CCTGAGGGTC AAGTTGTGAA GTTCTGGAAG CCAGAGGAGC CCATTGAAGT 600 

CATCAGGCCT GACATAGCAG CTCTGGTTAG ACAAGTGATC ATAAAAAAGA AAGAGGATCT 660 

ATGAGAATGC CATTGCGTTT CTAATAGAAC AGAGAAATGT CTCCATGAGG GTTTGGTCTC 720 

ATTTTAAACA TTTTTTTTTT GGAGACAGTG TCTCACTCTG TCACCCAGGC TGGAGTGCAG 780 

TAGTGCGTTC TCAGCTCATT GCAACCTCTG CCTTTTTAAA CATGCTATTA AATGTGGCAA 840 

TGAAGGATTT TTTTTTAATG TTATCTTGCT ATTAAGTGGT AATGAATGTT CCCAGGATGA 900 

GGATGTTACC CAAAGCAAAA ATCAAGAGTA GCCAAAGAAT CAACATGAAA TATATTAACT 960 

ACTTCCTCTG ACCATACTAA AGAATTCAGA ATACACAGTG ACCAATGTGC CTCAATATCT 1020 

TATTGTTCAA CTTGACATTT TCTAGGACTG TACTTGATGA AAATGCCAAC ACACTAGACC 1080 

ACTCTTTGGA TTCAAGAGCA CTGTGTATGA CTGAAATTTC TGGAATAACT GTAAATGGTT 1140 

ATGTTAATGG AATAAAACAC AAATGTTGAA AAATGTAAAA TATATATACA TAGATTCAAA 1200 

TCCTTATATA TGTATGCTTG TTTTGTGTAC AGGATTTTGT TTTTTCTTTT TAAGTACAGG 1260 

TTCCTAGTGT TTTACTATAA CTGTCACTAT GTATGTAACT GACATATATA AATAGTCATT 1320 
TATAAATGAC CGTATTATAA CA 
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Seq ID No: 135 Protein sequence: 
Protein Accession #: XP_059648.1 



1 11 21 31 41 51 

I I I I I I 

MEPLAAYPLK CSGPRAKVFA VLLSIVLCTV TLFLLQLKFL KPKINSFYAF EVKDAKGRTV 60
SLEKYKGKVS LVVNVASDCQ LTDRNYLGLK ELHKEFGPSH FSVLAFPCNQ FGESEPRPSK 120
EVESFARKNY GVTFPIFHKI KILGSEGEPA FRFLVDSSKK EPRWNFWKYL VNPEGQVVKF 180
WKPEEPIEVI RPDIAALVRQ VIIKKKEDL 



Seq ID NO: 136 DNA sequence

Nucleic Acid Accession #: NM_003003.1

Coding sequence: 304-2451 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CAAGTGCCGT CGCCGCGCCC CTTCCCCCTC CCGCCTCCCC GGCCCCCTCC CCGGAACCGG 60

CGGTCGAGCT ACGGTCGCGG ACGAGTGGAA CCGAGACTGC CCCGCGGAGC CGCCGGTATG 120

AGCGCCCCTC GCCACCCCGT GTCCCAGGCC CGGCCTTTCT GACAAGAGCT AGACTTCGGG 180 

CTCCTTGAGG ATATTCAGTT TTGTATGTTT GAATATCCTC TCACCATGTT CAGCATAAAG 240 

TACCATTCTT AATGATTATC CTCAACAAGA CAGGTGTGAG AGGGTTGCTG TTGCATTGCA 300 

ATCATGGTGC AAAAATACCA GTCCCCAGTG AGAGTGTACA AATACCCCTT TGAATTAATT 360 

ATGGCTGCCT ATGAAAGGAG GTTCCCTACA TGTCCTTTGA TTCCGATGTT CGTGGGCAGT 420 

GACACTGTGA GTGAATTCAA GAGCGAAGAT GGGGCTATTC ATGTCATTGA AAGGCGCTGC 480 

AAGCTGGATG TAGATGCACC CAGACTGCTG AAGAAGATTG CAGGAGTTGA TTATGTTTAT 540 

TTTGTCCAGA AAAACTCACT GAATTCTCGG GAACGTACTT TGCACATTGA GGCTTATAAT 600 

GAAACGTTTT CCAATCGGGT CATCATTAAT GAGCATTGCT GCTACACCGT TCACCCTGAA 660 

AATGAAGATT GGACCTGTTT TGAACAGTCT GCAAGTTTAG ATATTAAATC TTTCTTTGGT 720

TTTGAAAGTA CAGTGGAAAA AATTGCAATG AAACAATATA CCAGCAACAT TAAAAAAGGA 780 

AAGGAAATCA TCGAATACTA CCTTCGCCAA TTAGAAGAAG AAGGCATAAC CTTTGTGCCC 840 

CGTTGGAGTC CGCCTTCCAT CACGCCCTCT TCAGAGACAT CTTCATCATC CTCCAAGAAA 900 

CAAGCAGCGT CCATGGCCGT CGTCATCCCA GAAGCTGCCC TCAAGGAGGG GCTGAGTGGT 960 

GATGCCCTCA GCAGCCCCAG TGCACCTGAG CCCGTGGTGG GCACCCCTGA CGACAAACTA 1020 

GATGCCGACC ACATCAAGAG ATACCTGGGC GATTTGACTC CGCTGCAGGA GAGCTGCCTC 1080 

ATTAGACTTC GCCAGTGGCT CCAGGAGACC CACAAGGGCA AAATTCCAAA AGATGAGCAT 1140 

ATTCTTCGGT TCCTCCGTGC ACGGGATTTT AATATTGACA AAGCCAGAGA GATCATGTGT 1200 

CAGTCTTTGA CGTGGAGAAA GCAGCATCAG GTAGACTACA TTCTTGAAAC CTGGACCCCT 1260 

CCTCAGGTCC TTCAGGATTA CTACGCGGGA GGCTGGCATC ATCACGACAA AGATGGGCGG 1320 

CCCCTCTACG TGCTCAGGCT GGGGCAGATG GACACCAAAG GCTTGGTGAG AGCGCTCGGG 1380

GAGGAAGCCC TGCTGAGATA CGTTCTCTCC GTAAATGAAG AACGGCTAAG GCGATGCGAA 1440 

GAGAATACAA AAGTCTTTGG TCGGCCTATC AGCTCATGGA CCTGCCTGGT GGACTTGGAA 1500 

GGGCTGAACA TGCGCCACTT GTGGAGACCT GGTGTGAAAG CGCTGCTGCG GATCATCGAG 1560 

GTGGTGGAGG CCAACTACCC TGAGACACTG GGCCGCCTTC TCATCCTGCG GGCGCCCAGG 1620 

GTATTTCCTG TGCTCTGGAC GCTGGTTAGT CCGTTCATTG ATGACAACAC CAGAAGGAAG 1680 

TTCCTCATTT ATGCAGGAAA TGACTACCAG GGTCCTGGAG GCCTGCTGGA TTACATCGAC 1740 

AAAGAGATTA TTCCAGATTT CCTGAGTGGG GAGTGCATGT GCGAAGTGCC AGAGGGTGGA 1800 

CTGGTCCCCA AATCTCTGTA CCGGACTGCA GAGGAGCTGG AGAACGAAGA CCTGAAGCTC 1860 

TGGACTGAGA CCATCTACCA GTCTGCAAGC GTCTTCAAAG GAGCCCCACA TGAGATTCTC 1920 

ATTCAGATTG TGGATGCCTC GTCAGTCATC ACTTGGGATT TCGACGTGTG CAAAGGGGAC 1980 

ATTGTGTTTA ACATCTATCA CTCCAAGAGG TCGCCACAAC CACCCAAAAA GGACTCCCTG 2040 

GGAGCCCACA GCATCACCTC TCCGGGTGGG AACAATGTGC AGCTCATAGA CAAAGTCTGG 2100 

CAGCTGGGCC GCGACTACAG CATGGTGGAG TCGCCTCTGA TCTGCAAAGA AGGAGAAAGC 2160 

GTGCAGGGTT CCCATGTGAC CAGGTGGCCG GGCTTCTACA TCCTGCAGTG GAAATTCCAC 2220 

AGCATGCCTG CGTGCGCCGC CAGCAGCCTT CCCCGGGTGG ACGACGTGCT TGCGTCCCTG 2280 

CAGGTCTCTT CGCACAAGTG TAAAGTGATG TACTACACCG AGGTGATCGG CTCGGAGGAT 2340

TTCAGAGGTT CCATGACGAG CCTGGAGTCC AGCCACAGCG GCTTCTCCCA GCTGAGTGCC 2400 

GCCACCACCT CCTCCAGCCA GTCCCACTCC AGCTCCATGA TCTCCAGGTA GTGCCGCGCT 2460

GCCTGCACCT AGTGTGCAGA GGGGACGGCC GCCCCTCCTC GGACAGCAGC TGCACCCGCC 2520

CACCCAGCGG CGACATTGTA CAGACTCCTC TCACCTCTAG ATAGCAAATA GCTCTCAGAT 2580 

GGTAAACGTA GTCGTTTGAT CCCAAAACTA CCTTGGCAGG TAGTTTTAAC TCTGATCCTA 2640 

ACTTAACTCA ATAGCCATAG ATTTTGTATA CGTTGTGCAC AAAATCCAAC CAGAGCGCAA 2700 

GGGCTCTCTT GAAAGAAAAG TAGTTTCTGT ACCAATTAAA GGATTGACGT GGTCTCAGAT 2760 

ATTGATGCAA AAAATTTTTC CAACGAACTC CGCATTGTCC ATTAGTGAAT GAATTCCTGT 2820 

GACATCCTCC AGAGATGGCC CCTCCTCACC TGGGACGGAA GCTGCCAGCT CGCTTCCCCC 2880 

AAGCTGCCTC ATGGCCCGCA CGCCGCCTCA CGGCCCCCAT GCTTCCCGCC AGTCAAGATG 2940 

GTCTGTGGAC TTAGGGCCAG CCCTTGAGGT CCTTATCCTC TGAGGATTCA GAGGTTGCCT 3000 

GCGGAGTACC TTGTCCCAGG GCCAGACACA CCCACACCAC CCACTGTCTG CAGTGGGGCC 3060 

GGGGGCTCAG GAGGGGCTCT CAGGGACTCC TGGTGACTCC AGGAAAATGC TGCCATCGTT 3120 

AAACATTACT TTCTCTTTCC TCCTTTTCAA ATCTTTTTGA TACTTTTTAG AGCAGGATTT 3180 

TTCTGTATGT GAACTTGGGT GGGGGGGTTC TTCCCGTTTC CTTCCGTGCG TCGCCCCTCT 3240

CACCTGCAGT CAGCTCCCAG CCCAGTGTAG GCCATCTCCT CTGTGCCCTC TGGAGGCTCA 3300 

TTGTCTCAGA GCCCAGACAG TTCCAGCCAC TAGGAGGCCG TCTTGGAACC AGCAAGTCGC 3360

ATTTGCCACT TGACACTGTC CATGGGGTTT TATTAGTAGC TAAGCAGCAG CTCTCGCATC 3420 
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CACTTCAGGG TGGCGTGTGG CATGTAGGAG TCCTGCTTCT TTGTACATGG GAATTGTGGA 3480 

CTCATGCGTG TGTGTGTGTG CATGTGCTGT GTGTGTGCAT GTGTGCATGA CGGTGGGGGT 3540 

GCTGGGGGGA CGGGGTGAGT GGAAACTTAG TTTGAGTAAT GAAGGAATCT TCACAGAAGC 3600 

AAATCAGAAT ATGGGATTTG TTTGCCTTTT ACATTTTGTT TAATTCCTGA TTTTAAAGCC 3660 

TGCTCTATCT GGTACAGGCC CTTATTTTTT CAGCTTTTTA TGGGAAAAGC AGGTTATTTG 3720 

AGAATCTGTC CAGAAGTTGC ATAGGGGATG GCCTCCACGA TAAGGACATG CAACACGTGT 3780 

TTCTGTGTGC AGCAGAGGCC GTGTTTTTCA TGCCAAACCC CACGCGGCTG TCAACTGTGT 3840 

GCGTGGTAGG CATGGAGATC CTGGTTGTGC CGTCTCAGCT CCGCTCTGAA GGCACTGTGT 3900 

GGGTGCTGCG TGACTGGAGA GCTGTGTGGA GGCCATGTGT GCCCCGTGCA GGGATCAGGA 3960 

GGGCGGGGGA GGGACCGAGC AGCCCTCTTG CCCGGTCGGG TCAGCCCTAG TGGCTGCCTG 4020 

CACACTGTAG ACGTCCCAGG GCCTGTGCTG TGATCACCTG CCTTTGGACC ACATTTGTGT 4080 

TTGCTCTTAG AGATCGAGCT CCTCAGTGGT ACCTGAAGCC TTTGCTTCCG GAAAGCGCGG 4140 

TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT AGGGCTAGTA GGTAGGGCTA 4200 

GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG TAGGTAGGGT TAGTAGGTAG 4260 

GGCTAGTAGG TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT AGGGCTAGTA 4320 

GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG TAGGTAGGGT 4380 

TAGTAGGTAG GGCTAGTAGG TAGGGTTCGT AGGTAGGGCT AGTAGGTAGG GTTAGTAGGT 4440 

AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG GTAGGGCTGG 4500 

TAGGTAGGGT TAGTAGGTAG GGCTAGTAGG TAGGGCTAGT AGGTAGGGCT AGTAGGTAGG 4560 

GTTAGTAGGT AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG TTAGTAGGTA GGGTTCGTAG 4620 

GTAGGGCTGG TAGGTAGGGT TAGTAGGTAG GGCTAGTAGG TAGGGCTAGT AGGTAGGGCT 4680 

AGTAGGTAGG GCTAGTAGGT AGGGCTAGTA GGTAGGGCTA GTAGGTAGGG CTAGTAGGTA 4740 

GGGTTCGTAG GTAGGGTTCG TAGGTAGGGT TCGTAGGTAG GGTTAGTAGC GCGTCTGTGC 4800 

TGCTTCCACC TGGTGCTTCC TGTTCCCAAA TCACAAGGGC CTGAAGGTGG TCCCTGCTTT 4860 

CTCTTTCTCT TTCTCTGTGT CTCAGATGGC GATTTTGCTG ACAGCTGCCA AGAAAATGCT 4920 

TCACTCAACA GTCCTCATGT GCCCAGAGAT GTTTATAGAA CTGTTTGAAT TGCAGCCATC 4980 

CCCTGCCCCC TCCCAGGCTG AAGATCTGTT CTTTTTAAGT TGATTCGGGA GTGGCATTCT 5040 

TTTATACCCA AAGACTGTAG TGCATCTTGA AGAGCTCAAA GCACATGACC GCACAAATGC 5100 

TTACAGGGTT TCCTCCCGAG TAATCCAATC TCACTCCCCT TGTAAGGGAA TTCTGGGGCA 5160 

GCTATGGTTT GAGTATGCAG TTTGCATCGT GTTTCTACCT TTAGTACCTT GCCACTCTTT 5220 

TAAAACGCTG CTGTCATTTC CCATTTCTTA GTACTAATGA TTCTTTGATT CTCCCTCTAT 5280 

TATGTCTTAA TTCACTTTCC TTCCTAAATT TGTTATTTGC ATATCAAATT CTGTAAATGT 5340 

TTTGTAAACA TATTACCTCA CTTGGTAATA CAATACTGAT AGTCTTTAAA AGATTTTTTT 5400 
ATTGTTATCA ATAATAAATG TGAACTATTT AAAG 

Seq ID No: 137 Protein sequence:
Protein Accession #: NP_002994.1



1 11 21 31 41 51 

I I I I I I

MVQKYQSPVR VYKYPFELIM AAYERRFPTC PLIPMFVGSD TVSEFKSEDG AIHVIERRCK 60

LDVDAPRLLK KIAGVDYVYF VQKNSLNSRE RTLHIEAYNE TFSNRVIINE HCCYTVHPEN 120 

EDWTCFEQSA SLDIKSFFGF ESTVEKIAMK QYTSNIKKGK EIIEYYLRQL EEEGITFVPR 180 

WSPPSITPSS ETSSSSSKKQ AASMAVVIPE AALKEGLSGD ALSSPSAPEP VVGTPDDKLD 240

ADHIKRYLGD LTPLQESCLI RLRQWLQETH KGKIPKDEHI LRFLRARDFN IDKAREIMCQ 300 

SLTWRKQHQV DYILETWTPP QVLQDYYAGG WHHHDKDGRP LYVLRLGQMD TKGLVRALGE 360 

EALLRYVLSV NEERLRRCEE NTKVFGRPIS SWTCLVDLEG LNMRHLWRPG VKALLRIIEV 420

VEANYPETLG RLLILRAPRV FPVLWTLVSP FIDDNTRRKF LIYAGNDYQG PGGLLDYIDK 480

EIIPDFLSGE CMCEVPEGGL VPKSLYRTAE ELENEDLKLW TETIYQSASV FKGAPHEILI 540 

QIVDASSVIT WDFDVCKGDI VFNIYHSKRS PQPPKKDSLG AHSITSPGGN NVQLIDKVWQ 600 

LGRDYSMVES PLICKEGESV QGSHVTRWPG FYILQWKFHS MPACAASSLP RVDDVLASLQ 660 
VSSHKCKVMY YTEVIGSEDF RGSMTSLESS HSGFSQLSAA TTSSSQSHSS SMISR 

Seq ID NO: 138 DNA sequence 

Nucleic Acid Accession #: NM_004181.1 

Coding sequence: 32-670 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

GCAGAAATAG CCTAGGGAGA TCAACCCCGA GATGCTGAAC AAAGTGCTGT CCCGGCTGGG 60 

GGTCGCCGGC CAGTGGCGCT TCGTGGACGT GCTGGGGCTG GAAGAGGAGT CTCTGGGCTC 120 

GGTGCCAGCG CCTGCCTGCG CGCTGCTGCT GCTGTTTCCC CTCACGGCCC AGCATGAGAA 180 

CTTCAGGAAA AAGCAGATTG AAGAGCTGAA GGGACAAGAA GTTAGTCCTA AAGTGTACTT 240 

CATGAAGCAG ACCATTGGGA ATTCCTGTGG CACAATCGGA CTTATTCACG CAGTGGCCAA 300 

TAATCAAGAC AAACTGGGAT TTGAGGATGG ATCAGTTCTG AAACAGTTTC TTTCTGAAAC 360 

AGAGAAAATG TCCCCTGAAG ACAGAGCAAA ATGCTTTGAA AAGAATGAGG CCATACAGGC 420 

AGCCCATGAT GCCGTGGCAC AGGAAGGCCA ATGTCGGGTA GATGACAAGG TGAATTTCCA 480

TTTTATTCTG TTTAACAACG TGGATGGCCA CCTCTATGAA CTTGATGGAC GAATGCCTTT 540 

TCCGGTGAAC CATGGCGCCA GTTCAGAGGA CACCCTGCTG AAGGACGCTG CCAAGGTGTG 600 

CAGAGAATTC ACCGAGCGTG AGCAAGGAGA AGTCCGCTTC TCTGCCGTGG CTCTCTGCAA 660 

GGCAGCCTAA TGCTCTGTGG GAGGGACTTT GCTGATTTCC CCTCTTCCCT TCAACATGAA 720 

AATATATACC CCCCATGCAG TCTAAAATGC TTCAGTACTT GTGAAACACA GCTGTTCTTC 780 

TGTTCTGCAG ACACGCCTTC CCCTCAGCCA CACCCAGGCA CTTAAGCACA AGCAGAGTGC 840 

ACAGCTGTCC ACTGGGCCAT TGTGGTGTGA GCTTCAGATG GTGAAGCATT CTCCCCAGTG 900 
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TATGTCTTGT ATCCGATATC TAACGCTTTA AATGGCTACT TTGGTTTCTG TCTGTAAGTT 960 
AAGACCTTGG ATGTGGTTAT GTTGTCCTAA AGAATAAATT TTGCTGATAG TAGC 



Seq ID No: 139 Protein sequence: 
Protein Accession #: NP_004172.1



1 11 21 31 41 51

I I I I I I

MLNKVLSRLG VAGQWRFVDV LGLEEESLGS VPAPACALLL LFPLTAQHEN FRKKQIEELK 60
GQEVSPKVYF MKQTIGNSCG TIGLIHAVAN NQDKLGFEDG SVLKQFLSET EKMSPEDRAK 120
CFEKNEAIQA AHDAVAQEGQ CRVDDKVNFH FILFNNVDGH LYELDGRMPF PVNHGASSED 180
TLLKDAAKVC REFTEREQGE VRFSAVALCK AA

Seq ID NO: 140 DNA sequence
Nucleic Acid Accession #: NM_000201.1
Coding sequence: 58-1656 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

I I I I I I 

GCGCCCCAGT CGACGCTGAG CTCCTCTGCT ACTCAGAGTT GCAACCTCAG CCTCGCTATG 60 

GCTCCCAGCA GCCCCCGGCC CGCGCTGCCC GCACTCCTGG TCCTGCTCGG GGCTCTGTTC 120 

CCAGGACCTG GCAATGCCCA GACATCTGTG TCCCCCTCAA AAGTCATCCT GCCCCGGGGA 180 

GGCTCCGTGC TGGTGACATG CAGCACCTCC TGTGACCAGC CCAAGTTGTT GGGCATAGAG 240 

ACCCCGTTGC CTAAAAAGGA GTTGCTCCTG CCTGGGAACA ACCGGAAGGT GTATGAACTG 300 

AGCAATGTGC AAGAAGATAG CCAACCAATG TGCTATTCAA ACTGCCCTGA TGGGCAGTCA 360 

ACAGCTAAAA CCTTCCTCAC CGTGTACTGG ACTCCAGAAC GGGTGGAACT GGCACCCCTC 420 

CCCTCTTGGC AGCCAGTGGG CAAGAACCTT ACCCTACGCT GCCAGGTGGA GGGTGGGGCA 480 

CCCCGGGCCA ACCTCACCGT GGTGCTGCTC CGTGGGGAGA AGGAGCTGAA ACGGGAGCCA 540 

GCTGTGGGGG AGCCCGCTGA GGTCACGACC ACGGTGCTGG TGAGGAGAGA TCACCATGGA 600 

GCCAATTTCT CGTGCCGCAC TGAACTGGAC CTGCGGCCCC AAGGGCTGGA GCTGTTTGAG 660 

AACACCTCGG CCCCCTACCA GCTCCAGACC TTTGTCCTGC CAGCGACTCC CCCACAACTT 720 

GTCAGCCCCC GGGTCCTAGA GGTGGACACG CAGGGGACCG TGGTCTGTTC CCTGGACGGG 780 

CTGTTCCCAG TCTCGGAGGC CCAGGTCCAC CTGGCACTGG GGGACCAGAG GTTGAACCCC 840 

ACAGTCACCT ATGGCAACGA CTCCTTCTGG GCCAAGGCCT CAGTCAGTGT GACCGCAGAG 900 

GACGAGGGCA CCCAGCGGCT GACGTGTGCA GTAATACTGG GGAACCAGAG CCAGGAGACA 960

CTGCAGACAG TGACCATCTA CAGCTTTCCG GCGCCCAACG TGATTCTGAC GAAGCCAGAG 1020 

GTCTCAGAAG GGACCGAGGT GACAGTGAAG TGTGAGGCCC ACCCTAGAGC CAAGGTGACG 1080 

CTGAATGGGG TTCCAGCCCA GCCACTGGGC CCGAGGGCCC AGCTCCTGCT GAAGGCCACC 1140 

CCAGAGGACA ACGGGCGCAG CTTCTCCTGC TCTGCAACCC TGGAGGTGGC CGGCCAGCTT 1200 

ATACACAAGA ACCAGACCCG GGAGCTTCGT GTCCTGTATG GCCCCCGACT GGACGAGAGG 1260 

GATTGTCCGG GAAACTGGAC GTGGCCAGAA AATTCCCAGC AGACTCCAAT GTGCCAGGCT 1320 

TGGGGGAACC CATTGCCCGA GCTCAAGTGT CTAAAGGATG GCACTTTCCC ACTGCCCATC 1380 

GGGGAATCAG TGACTGTCAC TCGAGATCTT GAGGGCACCT ACCTCTGTCG GGCCAGGAGC 1440 

ACTCAAGGGG AGGTCACCCG CGAGGTGACC GTGAATGTGC TCTCCCCCCG GTATGAGATT 1500 

GTCATCATCA CTGTGGTAGC AGCCGCAGTC ATAATGGGCA CTGCAGGCCT CAGCACGTAC 1560 

CTCTATAACC GCCAGCGGAA GATCAAGAAA TACAGACTAC AACAGGCCCA AAAAGGGACC 1620 

CCCATGAAAC CGAACACACA AGCCACGCCT CCCTGAACCT ATCCCGGGAC AGGGCCTCTT 1680 

CCTCGGCCTT CCCATATTGG TGGCAGTGGT GCCACACTGA ACAGAGTGGA AGACATATGC 1740 

CATGCAGCTA CACCTACCGG CCCTGGGACG CCGGAGGACA GGGCATTGTC CTCAGTCAGA 1800 

TACAACAGCA TTTGGGGCCA TGGTACCTGC ACACCTAAAA CACTAGGCCA CGCATCTGAT 1860 

CTGTAGTCAC ATGACTAAGC CAAGAGGAAG GAGCAAGACT CAAGACATGA TTGATGGATG 1920 

TTAAAGTCTA GCCTGATGAG AGGGGAAGTG GTGGGGGAGA CATAGCCCCA CCATGAGGAC 1980 

ATACAACTGG GAAATACTGA AACTTGCTGC CTATTGGGTA TGCTGAGGCC CACAGACTTA 2040 

CAGAAGAAGT GGCCCTCCAT AGACATGTGT AGCATCAAAA CACAAAGGCC CACACTTCCT 2100 

GACGGATGCC AGCTTGGGCA CTGCTGTCTA CTGACCCCAA CCCTTGATGA TATGTATTTA 2160 

TTCATTTGTT ATTTTACCAG CTATTTATTG AGTGTCTTTT ATGTAGGCTA AATGAACATA 2220 

GGTCTCTGGC CTCACGGAGC TCCCAGTCCA TGTCACATTC AAGGTCACCA GGTACAGTTG 2280 

TACAGGTTGT ACACTGCAGG AGAGTGCCTG GCAAAAAGAT CAAATGGGGC TGGGACTTCT 2340 

CATTGGCCAA CCTGCCTTTC CCCAGAAGGA GTGATTTTTC TATCGGCACA AAAGCACTAT 2400 

ATGGACTGGT AATGGTTCAC AGGTTCAGAG ATTACCCAGT GAGGCCTTAT TCCTCCCTTC 2460 

CCCCCAAAAC TGACACCTTT GTTAGCCACC TCCCCACCCA CATACATTTC TGCCAGTGTT 2520 

CACAATGACA CTCAGCGGTC ATGTCTGGAC ATGAGTGCCC AGGGAATATG CCCAAGCTAT 2580 

GCCTTGTCCT CTTGTCCTGT TTGCATTTCA CTGGGAGCTT GCACTATTGC AGCTCCAGTT 2640 

TCCTGCAGTG ATCAGGGTCC TGCAAGCAGT GGGGAAGGGG GCCAAGGTAT TGGAGGACTC 2700 

CCTCCCAGCT TTGGAAGGGT CATCCGCGTG TGTGTGTGTG TGTATGTGTA GACAAGCTCT 2760 

CGCTCTGTCA CCCAGGCTGG AGTGCAGTGG TGCAATCATG GTTCACTGCA GTCTTGACCT 2820 

TTTGGGCTCA AGTGATCCTC CCACCTCAGC CTCCTGAGTA GCTGGGACCA TAGGCTCACA 2880 

ACACCACACC TGGCAAATTT GATTTTTTTT TTTTTTTTCA GAGACGGGGT CTCGCAACAT 2940 
TGCCCAGACT TCCTTTGTGT TAGTTAATAA AGCTTTCTCA ACTGCC 

Seq ID No: 141 Protein sequence: 
Protein Accession #: NP_000192.1
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I I I I I I 

MLQFVRAGAR AWLRPTGSQG LSS1AEEAAR ATBNPEQVAS EGLPEPVLRK VELPVPTHRR 60 
PVQAWVESLR GFEQERVGLA DLHPDVFATA PRLDILHQVA MWQKNFKRIS YAKTKTRAEV 120 
RGGGGKPLAA ERHWAGPAWQ HPLSALARRR CCPWPPGPTS YYYMLPMKVR ALGLKVALTV 180 
KLAQDDLHIM DSLELPTGDP QYLTEIAHYR RWGDSVLLVD LTHEEMPQSI VEATSRLKTP 240 
NLIPAVGLNV HSMLKHQTLV LTLPTVAFXiE DKLLWQDSRY RPLYPPSLPY SDFPRPLPHA 300 
TQGPAATPYH C 

Seq ID NO: 142 DNA sequence

Nucleic Acid Accession #: NM_000270.1 

Coding sequence: 110-979 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

AACTGTGCGA ACCAGACCCG GCAGCCTTGC TCAGTTCAGC ATAGCGGAGC GGATCCGATC 60
GGATCGGAGC ACACCGGAGC AGGCTCATCG AGAAGGCGTC TGCGAGACCA TGGAGAACGG 120
ATACACCTAT GAAGATTATA AGAACACTGC AGAATGGCTT CTGTCTCATA CTAAGCACCG 180
ACCTCAAGTT GCAATAATCT GTGGTTCTGG ATTAGGAGGT CTGACTGATA AATTAACTCA 240
GGCCCAGATC TTTGACTACA GTGAAATCCC CAACTTTCCT CGAAGTACAG TGCCAGGTCA 300
TGCTGGCCGA CTGGTGTTTG GGTTCCTGAA TGGCAGGGCC TGTGTGATGA TGCAGGGCAG 360
GTTCCACATG TATGAAGGGT ACCCACTCTG GAAGGTGACA TTCCCAGTGA GGGTTTTCCA 420
CCTTCTGGGT GTGGACACCC TGGTAGTCAC CAATGCAGCA GGAGGGCTGA ACCCCAAGTT 480
TGAGGTTGGA GATATCATGC TGATCCGTGA CCATATCAAC CTACCTGGTT TCAGTGGTCA 540
GAACCCTCTC AGAGGGCCCA ATGATGAAAG GTTTGGAGAT CGTTTCCCTG CCATGTCTGA 600
TGCCTACGAC CGGACTATGA GGCAGAGGGC TCTCAGTACC TGGAAACAAA TGGGGGAGCA 660
ACGTGAGCTA CAGGAAGGCA CCTATGTGAT GGTGGCAGGC CCCAGCTTTG AGACTGTGGC 720
AGAATGTCGT GTGCTGCAGA AGCTGGGAGC AGACGCTGTT GGCATGAGTA CAGTACCAGA 780
AGTTATCGTT GCACGGCACT GTGGACTTCG AGTCTTTGGC TTCTCACTCA TCACTAACAA 840
GGTCATCATG GATTATGAAA GCCTGGAGAA GGCCAACCAT GAAGAAGTCT TAGCAGCTGG 900
CAAACAAGCT GCACAGAAAT TGGAACAGTT TGTCTCCATT CTTATGGCCA GCATTCCACT 960
CCCTGACAAA GCCAGTTGAC CTGCCTTGGA GTCGTCTGGC ATCTCCCACA CAAGACCCAA 1020
GTAGCTGCTA CCTTCTTTGG CCCCTTGCTG GAGTCATGTG CCTCTGTCCT TAGGTTGTAG 1080
CAGAAAGGAA AAGATTCCTG TCCTTCACCT TTCCCACTTT CTTCTACCAG ACCCTTCTGG 1140
TGCCAGATCC TCTTCTCAAA GCTGGGATTA CAGGTGTGAG CATAGTGAGA CCTTGGCGCT 1200
ACAAAATAAA GCTGTTCTCA TTCCTGTTCT TTCTTACACA AGAGCTGGAG CCCGTGCCCT 1260
ACCACACATC TGTGGAGATG CCCAGGATTT GACTCGGGCC TTAGAACTTT GCATAGCAGC 1320
TGCTACTAGC TCTTTGAGAT AATACATTCC GAGGGGCTCA GTTCTGCCTT ATCTAAATCA 1380
CCAGAGACCA AACAAGGACT AATCCAATAC CTCTTGGA



Seq ID No: 143 Protein sequence: 
Protein Accession #: NP_000261.1



1 11 21 31 41 51

I I I I I I

MENGYTYEDY KNTAEWLLSH TKHRPQVAII CGSGLGGLTD KLTQAQIFDY SEIPNFPRST 60
VPGHAGRLVF GFLNGRACVM MQGRFHMYEG YPLWKVTFPV RVFHLLGVDT LVVTNAAGGL 120
NPKFEVGDIM LIRDHINLPG FSGQNPLRGP NDERFGDRFP AMSDAYDRTM RQRALSTWKQ 180
MGEQRELQEG TYVMVAGPSF ETVAECRVLQ KLGADAVGMS TVPEVIVARH CGLRVFGFSL 240
ITNKVIMDYE SLEKANHEEV LAAGKQAAQK LEQFVSILMA SIPLPDKAS



Seq ID NO: 144 DNA sequence 

Nucleic Acid Accession #: NM_015577.1 

Coding sequence: 112-3054 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GAAGCGGCGG GCGGGGTGGA GCAGCCAGCT GGGTCCGGGG AGCGCCGCCG CCGCCTCGAT 60

GGGGTGTTGA AAAGTCTCCT CTAGAGCTTT GGAAGGCTGA ATGCACTAAA CATGAAGAGC 120

TTGAAAGCGA AGTTCAGGAA GAGTGACACC AATGAGTGGA ACAAGAATGA TGACCGGCTA 180

CTGCAGGCCG TGGAGAATGG AGATGCGGAG AAGGTGGCCT CACTGCTCGG CAAGAAGGGG 240

GCCAGTGCCA CCAAACACGA CAGTGAGGGC AAGACCGCTT TCCATCTTGC TGCTGCAAAA 300

GGACACGTGG AATGCCTCAG GGTCATGATT ACACATGGTG TGGATGTGAC AGCCCAAGAT 360

ACTACCGGAC ACAGCGCCTT ACATCTCGCA GCCAAGAACA GCCACCATGA ATGCATCAGG 420

AGGCTGCTTC AGTCTAAATG CCCAGCCGAA AGTGTCGACA GCTCTGGGAA AACAGCTTTA 480

CATTATGCAG CGGCTCAGGG CTGCCTTCAA GCTGTGCAGA TTCTCTGCGA ACACAAGAGC 540

CCCATAAACC TCAAAGATTT GGATGGGAAT ATACCGCTGC TTCTTGCTGT ACAAAATGGT 600

CACAGTGAGA TCTGTCACTT TCTCCTGGAT CATGGAGCAG ATGTCAATTC CAGGAACAAA 660

AGTGGAAGAA CTGCTCTCAT GCTGGCCTGT GAGATTGGCA GCTCTAACGC TGTGGAAGCC 720

TTAATTAAAA AGGGTGCAGA CCTAAACCTT GTAGATTCTC TTGGATACAA TGCCTTACAT 780

TATTCCAAAC TCTCAGAAAA TGCAGGAATT CAAAGCCTTC TATTATCAAA AATCTCTCAG 840

GATGCTGATT TAAAGACCCC AACAAAACCA AAGCAGCATG ACCAAGTCTC TAAAATAAGC 900

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TCAGAAAGAA GTGGAACTCC AAAAACACGC AAAGCTCCAC CACCTCCTAT CAGTCCTACC 960 

CAGTTGAGTG ATGTCTCTTC CCCAAGATCA ATAACTTCGA CTCCACTATC GGGAAAGGAA 1020

TCGGTATTTT TTGCTGAACC ACCCTTCAAG GCTGAGATCA GTTCTATACG AGAAAACAAA 1080

GACAGACTAA GTGACAGTAC TACAGGTGCT GATAGCTTAT TGGATATAAG TTCTGAAGCT 1140

GACCAACAAG ATCTTCTCTC TCTATTGCAA GCAAAAGTTG CTTCCCTTAC CTTACACAAT 1200

AAGGAGTTAC AAGATAAATT ACAGGCCAAA TCACCCAAGG AGGCGGAAGC AGACCTAAGC 1260 

TTTGACTCAT ACCATTCCAC CCAAACTGAC TTGGGCCCAT CCCTGGGAAA ACCTGGTGAA 1320 

ACCTCTCCCC CAGACTCCAA ATCATCTCCA TCTGTCTTAA TACATTCTTT AGGTAAATCC 1380

ACTACTGACA ATGATGTCAG AATTCAGCAA CTGCAAGAGA TTTTGCAAGA TCTACAGAAG 1440

AGATTAGAGA GCTCTGAAGC AGAGAGAAAA CAGCTACAGG TCGAACTCCA ATCCCGAAGG 1500

GCAGAACTGG TATGCTTAAA CAACACTGAG ATTTCAGAGA ACAGCTCTGA CCTCAGCCAG 1560 

AAACTTAAAG AAACTCAGAG CAAATACGAG GAGGCTATGA AAGAAGTCCT TAGTGTGCAG 1620 

AAGCAGATGA AACTCGGTCT TGTCTCACCT GAAAGCATGG ATAATTATTC ACATTTCCAC 1680 

GAGCTGAGGG TCACGGAAGA GGAAATAAAT GTGCTAAAGC AGGATCTGCA GAATGCATTA 1740 

GAAGAAAGTG AAAGAAATAA AGAGAAAGTG AGAGAGTTAG AGGAAAAACT GGTAGAGAGG 1800

GAGAAAGGTA CAGTGATTAA GCCACCTGTG GAAGAGTACG AGGAAATGAA AAGTTCATAT 1860 

TGCTCTGTTA TTGAGAATAT GAATAAGGAG AAAGCATTTT TGTTTGAGAA ATACCAAGAA 1920 

GCCCAAGAAG AAATCATGAA ATTAAAAGAC ACACTAAAAA GTCAGATGAC ACAGGAAGCC 1980 

AGTGATGAAG CTGAGGACAT GAAAGAAGCC ATGAATAGGA TGATAGATGA ACTCAATAAA 2040 

CAGGTGAGCG AGCTGTCACA GCTGTACAAA GAAGCCCAGG CTGAGCTGGA GGATTACAGG 2100

AAGAGGAAAT CTCTAGAGGA TGTCACAGCT GAATATATCC ATAAAGCAGA GCATGAGAAA 2160

CTGATGCAAT TGACAAACGT GTCCAGGGCT AAAGCAGAAG ATGCACTGTC TGAAATGAAG 2220

TCTCAGTATT CAAAAGTGTT GAATGAGTTG ACCCAGCTCA AACAACTGGT GGATGCACAA 2280 

AAAGAGAACT CTGTCTCTAT CACAGAACAT TTGCAAGTGA TAACCACGCT GCGGACTGCA 2340 

GCAAAAGAGA TGGAAGAAAA AATAAGCAAT CTTAAGGAAC ACCTTGCAAG CAAGGAAGTG 2400

GAAGTAGCAA AGCTGGAGAA ACAACTCTTA GAAGAGAAAG CTGCTATGAC TGATGCAATG 2460 

GTACCTCGGT CTTCCTATGA AAAACTCCAG TCATCCTTAG AGAGTGAAGT GAGTGTGTTG 2520 

GCATCGAAAT TAAAGGAATC TGTGAAAGAG AAAGAGAAGG TCCATTCAGA GGTTGTCCAG 2580 

ATTAGAAGTG AGGTCTCACA GGTGAAAAGA GAAAAGGAAA ATATTCAGAC TCTCTTGAAA 2640 

TCCAAAGAGC AAGAAGTAAA TGAACTTCTG CAAAAATTCC AGCAAGCTCA GGAAGAACTT 2700

GCAGAAATGA AAAGATACGC TGAGAGCTCT TCAAAACTGG AGGAAGATAA AGATAAAAAG 2760 

ATAAATGAGA TGTCGAAGGA AGTCACCAAA TTGAAGGAGG CCTTGAACAG CCTCTCCCAG 2820 

CTCTCCTACT CAACAAGCTC ATCCAAAAGG CAGAGTCAGC AGCTGGAGGC GCTGCAGCAG 2880 

CAAGTCAAAC AGCTCCAGAA CCAGCTGGCG GAATGCAAGA AACAACACCA GGAGGTCATA 2940 

TCAGTTTACA GAATGCATCT TCTGTATGCT GTGCAGGGCC AGATGGATGA AGATGTCCAG 3000

AAAGTACTGA AGCAAATCCT TACCATGTGT AAAAACCAGT CTCAAAAGAA GTAAAGTGGA 3060 

TTCCTTGGCA GGACACTGCC CCTTGTCATC TGTCTTTGTG TTAGATCCAG AGTTGTCGGC 3120 

AGCCGCTGCC ATTGTTCTCA TTCGTGGTAT GCACTGTGGC CTAGCGTAGC TTCTTCCCTT 3180 

TCCAAAGGTT TCTGAGGACT TCTCCCAGGA GAAGACTGCC CGCCTCAGAA CTGCTTAGAG 3240 

ACTTCAAACC AGCAGAGGTG AAAGTCCCTG TCATCCCTTC AGATTCCAGA GCTGGGATCA 3300

GCCATGCCCA GAGGTCTGGT CCTGATGCTG GCAGGGGGGC CCCCTCCTCC ATCCCTGACT 3360 

GGCTGAGTGG CTTTATCACC ACCGAGTGAT GTGCTGAGGC CTCCTGCAGT GAATGCTCCT 3420 

TCCATTCCTG TACTCGGGCA GTGCCATTCA GCACAGGAGA GCTCTTTTTG CCTTTGGCTT 3480 

TCAATTCCAA AACATGATTT AATTTCTAAC TAAATTAGTA TGGCACTAGT TATGAAGTAT 3540 

CTGCTTAAAA CCCTTCATCA TGATATCCTG TGGATTTAAA AACTCTAATT CCATGTTTTC 3600

TTCCCATCTG CCTTATATAT CTCATCACCC TGCTTATCAA TATTCAGTTT GATGAGCACT 3660 

ATTAACTAAA ATATGAAACT TAAAAACAAA AGCAAGTTGT CCTTAAAAGT TCTTTTTTTA 3720 

AGTAAATTGT TGACATACTG CAAATTTTCT ATGCAAACTT GCCTCCTGCT GTTATCTGTG 3780 

AAGCTCAGGA AATCCAAACA TTTGTGTTTC AACAAGGGAC AGTAAACTGT GTGTTTACAG 3840 

CCAAAAGAAA TGCCTCATAG TTCTTAACCT CAACTTTTGT AGAAGTATTT TTTTCTCTGT 3900

AATATTTTTA TTGGCTCATA AAGATGTTTT CATATCTGAA CTCCTAAATA AGTGAAATTA 3960 

CAGTAGATTA TATTAACAAA ATACTTTTTA GGTAGCCATG CTTGAGACTT TTTAAAAATA 4020 

TAACTTTTTC CTTAAAGTTT TCAGCTATAG CAAAAGGTAG TTATGTATGC CAGACCTAAT 4080

ATGAGCTGCC ACCAACACCC CTAGAACTTT CAGCCATGGT GTCTTCAGAA TTGTAGCGCA 4140

TTTCTGAATC TAGCAAATCC TCCTTTTACC CGTTGAATGT TTTGAATGCC CTGACTCTAC 4200

CAGCGCCCAT AAATGATCTC TAGAAGGACT GTTAGTACCA ATCTGTTTTT CAACTTTGAA 4260 

GCTAAAAACC CTGATATGGT AATATTATGG TGCATAGCAG AGGTCTCGGA AAAAAAATAT 4320 

TTCTGTTCAC TTTACTTTCA GGTTAAAAAT GTTTCTAACA CGCTTGCAAC TTCCCTTATG 4380 

GCATTAATCT TGTTGAGGGA GAGAGACAGA ATCCTGGACT CTCCAAAGTA TTTAACTGAA 4440 

AGTAGGGCCT GCTCTGACAG GGCCCATGTC CCACAAGGCT GCTTGGCCTC AGTGGGTGCT 4500

TGGCTGTGCT GGATGATATG TTGATCTGTA TTGGATAAGG ACCAATGACA GCAAAGCAAA 4560 

AATGGCTTTA AAGCTTGGTG TTACTTTTCT TAAGTTGTTT AATTATAGTT AAGCAATTTC 4620 

AAAAATGCTC CAAAGAAATG TGAAAGGACC TTTTGTCACA GCACTTCAGA AAATACACAA 4680 

CAGCCCCTTC TGCCCCCGCA CAGAAATGCT GCAGAGTATA TAAAACTTGA GACATTTTTG 4740 

TAGGATGCCT GACGAGGTGT AGCCTTTTAT CTTGTTTCCG GATGCATATT TATTACGAGT 4800

ACTCTGGTTA AATATTGAAA AGTTATATGC TGTAGTTTTT AGTATTTTGT CTTTGTAATT 4860 

TACAGAAGTT ATTGGAGAAA ATAAACTTGT TTCATTTTGC AAAAAAAAAA AAAAAAAAAA 4920 
AAAAA 

Seq ID No: 145 Protein sequence:
Protein Accession #: NP_056392.1



1 11 21 31 41 51 
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MKSLKAKFRK SDTNEWNKND DRLLQAVENG DAEKVASLLG KKGASATKHD SEGKTAFHLA 60
AAKGHVECLR VMITHGVDVT AQDTTGHSAL HLAAKNSHHE CIRRLLQSKC PAESVDSSGK 120
TALHYAAAQG CLQAVQILCE HKSPINLKDL DGNIPLLLAV QNGHSEICHF LLDHGADVNS 180
RNKSGRTALM LACEIGSSNA VEALIKKGAD LNLVDSLGYN ALHYSKLSEN AGIQSLLLSK 240
ISQDADLKTP TKPKQHDQVS KISSERSGTP KTRKAPPPPI SPTQLSDVSS PRSITSTPLS 300
GKESVFFAEP PFKAEISSIR ENKDRLSDST TGADSLLDIS SEADQQDLLS LLQAKVASLT 360
LHNKELQDKL QAKSPKEAEA DLSFDSYHST QTDLGPSLGK PGETSPPDSK SSPSVLIHSL 420
GKSTTDNDVR IQQLQEILQD LQKRLESSEA ERKQLQVELQ SRRAELVCLN NTEISENSSD 480
LSQKLKETQS KYEEAMKEVL SVQKQMKLGL VSPESMDNYS HFHELRVTEE EINVLKQDLQ 540
NALEESERNK EKVRELEEKL VEREKGTVIK PPVEEYEEMK SSYCSVIENM NKEKAFLFEK 600
YQEAQEEIMK LKDTLKSQMT QEASDEAEDM KEAMNRMIDE LNKQVSELSQ LYKEAQAELE 660
DYRKRKSLED VTAEYIHKAE HEKLMQLTNV SRAKAEDALS EMKSQYSKVL NELTQLKQLV 720
DAQKENSVSI TEHLQVITTL RTAAKEMEEK ISNLKEHLAS KEVEVAKLEK QLLEEKAAMT 780
DAMVPRSSYE KLQSSLESEV SVLASKLKES VKEKEKVHSE VVQIRSEVSQ VKREKENIQT 840
LLKSKEQEVN ELLQKFQQAQ EELAEMKRYA ESSSKLEEDK DKKINEMSKE VTKLKEALNS 900
LSQLSYSTSS SKRQSQQLEA LQQQVKQLQN QLAECKKQHQ EVISVYRMHL LYAVQGQMDE 960
DVQKVLKQIL TMCKNQSQKK

Seq ID NO: 146 DNA sequence 
Nucleic Acid Accession #: NM_000459.1
Coding sequence: 149-3523 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CTTCTGTGCT GTTCCTTCTT GCCTCTAACT TGTAAACAAG ACGTACTAGG ACGATGCTAA 60

TGGAAAGTCA CAAACCGCTG GGTTTTTGAA AGGATCCTTG GGACCTCATG CACATTTGTG 120

GAAACTGGAT GGAGAGATTT GGGGAAGCAT GGACTCTTTA GCCAGCTTAG TTCTCTGTGG 180

AGTCAGCTTG CTCCTTTCTG GAACTGTGGA AGGTGCCATG GACTTGATCT TGATCAATTC 240

CCTACCTCTT GTATCTGATG CTGAAACATC TCTCACCTGC ATTGCCTCTG GGTGGCGCCC 300

CCATGAGCCC ATCACCATAG GAAGGGACTT TGAAGCCTTA ATGAACCAGC ACCAGGATCC 360

GCTGGAAGTT ACTCAAGATG TGACCAGAGA ATGGGCTAAA AAAGTTGTTT GGAAGAGAGA 420

AAAGGCTAGT AAGATCAATG GTGCTTATTT CTGTGAAGGG CGAGTTCGAG GAGAGGCAAT 480

CAGGATACGA ACCATGAAGA TGCGTCAACA AGCTTCCTTC CTACCAGCTA CTTTAACTAT 540

GACTGTGGAC AAGGGAGATA ACGTGAACAT ATCTTTCAAA AAGGTATTGA TTAAAGAAGA 600

AGATGCAGTG ATTTACAAAA ATGGTTCCTT CATCCATTCA GTGCCCCGGC ATGAAGTACC 660

TGATATTCTA GAAGTACACC TGCCTCATGC TCAGCCCCAG GATGCTGGAG TGTACTCGGC 720

CAGGTATATA GGAGGAAACC TCTTCACCTC GGCCTTCACC AGGCTGATAG TCCGGAGATG 780

TGAAGCCCAG AAGTGGGGAC CTGAATGCAA CCATCTCTGT ACTGCTTGTA TGAACAATGG 840

TGTCTGCCAT GAAGATACTG GAGAATGCAT TTGCCCTCCT GGGTTTATGG GAAGGACGTG 900

TGAGAAGGCT TGTGAACTGC ACACGTTTGG CAGAACTTGT AAAGAAAGGT GCAGTGGACA 960

AGAGGGATGC AAGTCTTATG TGTTCTGTCT CCCTGACCCC TATGGGTGTT CCTGTGCCAC 1020

AGGCTGGAAG GGTCTGCAGT GCAATGAAGC ATGCCACCCT GGTTTTTACG GGCCAGATTG 1080

TAAGCTTAGG TGCAGCTGCA ACAATGGGGA GATGTGTGAT CGCTTCCAAG GATGTCTCTG 1140

CTCTCCAGGA TGGCAGGGGC TCCAGTGTGA GAGAGAAGGC ATACCGAGGA TGACCCCAAA 1200

GATAGTGGAT TTGCCAGATC ATATAGAAGT AAACAGTGGT AAATTTAATC CCATTTGCAA 1260

AGCTTCTGGC TGGCCGCTAC CTACTAATGA AGAAATGACC CTGGTGAAGC CGGATGGGAC 1320

AGTGCTCCAT CCAAAAGACT TTAACCATAC GGATCATTTC TCAGTAGCCA TATTCACCAT 1380

CCACCGGATC CTCCCCCCTG ACTCAGGAGT TTGGGTCTGC AGTGTGAACA CAGTGGCTGG 1440

GATGGTGGAA AAGCCCTTCA ACATTTCTGT TAAAGTTCTT CCAAAGCCCC TGAATGCCCC 1500

AAACGTGATT GACACTGGAC ATAACTTTGC TGTCATCAAC ATCAGCTCTG AGCCTTACTT 1560

TGGGGATGGA CCAATCAAAT CCAAGAAGCT TCTATACAAA CCCGTTAATC ACTATGAGGC 1620

TTGGCAACAT ATTCAAGTGA CAAATGAGAT TGTTACACTC AACTATTTGG AACCTCGGAC 1680

AGAATATGAA CTCTGTGTGC AACTGGTCCG TCGTGGAGAG GGTGGGGAAG GGCATCCTGG 1740

ACCTGTGAGA CGCTTCACAA CAGCTTCTAT CGGACTCCCT CCTCCAAGAG GTCTAAATCT 1800

CCTGCCTAAA AGTCAGACCA CTCTAAATTT GACCTGGCAA CCAATATTTC CAAGCTCGGA 1860

AGATGACTTT TATGTTGAAG TGGAGAGAAG GTCTGTGCAA AAAAGTGATC AGCAGAATAT 1920

TAAAGTTCCA GGCAACTTGA CTTCGGTGCT ACTTAACAAC TTACATCCCA GGGAGCAGTA 1980

CGTGGTCCGA GCTAGAGTCA ACACCAAGGC CCAGGGGGAA TGGAGTGAAG ATCTCACTGC 2040

TTGGACCCTT AGTGACATTC TTCCTCCTCA ACCAGAAAAC ATCAAGATTT CCAACATTAC 2100

ACACTCCTCG GCTGTGATTT CTTGGACAAT ATTGGATGGC TATTCTATTT CTTCTATTAC 2160

TATCCGTTAC AAGGTTCAAG GCAAGAATGA AGACCAGCAC GTTGATGTGA AGATAAAGAA 2220

TGCCACCATC ATTCAGTATC AGCTCAAGGG CCTAGAGCCT GAAACAGCAT ACCAGGTGGA 2280

CATTTTTGCA GAGAACAACA TAGGGTCAAG CAACCCAGCC TTTTCTCATG AACTGGTGAC 2340

CCTCCCAGAA TCTCAAGCAC CAGCGGACCT CGGAGGGGGG AAGATGCTGC TTATAGCCAT 2400

CCTTGGCTCT GCTGGAATGA CCTGCCTGAC TGTGCTGTTG GCCTTTCTGA TCATATTGCA 2460

ATTGAAGAGG GCAAATGTGC AAAGGAGAAT GGCCCAAGCC TTCCAAAACG TGAGGGAAGA 2520

ACCAGCTGTG CAGTTCAACT CAGGGACTCT GGCCCTAAAC AGGAAGGTCA AAAACAACCC 2580

AGATCCTACA ATTTATCCAG TGCTTGACTG GAATGACATC AAATTTCAAG ATGTGATTGG 2640

GGAGGGCAAT TTTGGCCAAG TTCTTAAGGC GCGCATCAAG AAGGATGGGT TACGGATGGA 2700

TGCTGCCATC AAAAGAATGA AAGAATATGC CTCCAAAGAT GATCACAGGG ACTTTGCAGG 2760

AGAACTGGAA GTTCTTTGTA AACTTGGACA CCATCCAAAC ATCATCAATC TCTTAGGAGC 2820

ATGTGAACAT CGAGGCTACT TGTACCTGGC CATTGAGTAC GCGCCCCATG GAAACCTTCT 2880

GGACTTCCTT CGCAAGAGCC GTGTGCTGGA GAOGGACCCA GCATTTGCCA TTGCCAATAG 2940

CACCGCGTCC ACACTGTCCT CCCAGCAGCT CCTTCACTTC GCTGCCGACG TGGCCCGGGG 3000

CATGGACTAC TTGAGCCAAA AACAGTTTAT CCACAGGGAT CTGGCTGCCA GAAACATTTT 3060

AGTTGGTGAA AACTATGTGG CAAAAATAGC AGATTTTGGA TTGTCCCGAG GTCAAGAGGT 3120

GTACGTGAAA AAGACAATGG GAAGGCTCCC AGTGCGCTGG ATGGCCATCG AGTCACTGAA 3180

TTACAGTGTG TACACAACCA ACAGTGATGT ATGGTCCTAT GGTGTGTTAC TATGGGAGAT 3240

TGTTAGCTTA GGAGGCACAC CCTACTGCGG GATGACTTGT GCAGAACTCT ACGAGAAGCT 3300

GCCCCAGGGC TACAGACTGG AGAAGCCCCT GAACTGTGAT GATGAGGTGT ATGATCTAAT 3360

GAGACAATGC TGGCGGGAGA AGCCTTATGA GAGGCCATCA TTTGCCCAGA TATTGGTGTC 3420

CTTAAACAGA ATGTTAGAGG AGCGAAAGAC CTACGTGAAT ACCACGCTTT ATGAGAAGTT 3480

TACTTATGCA GGAATTGACT GTTCTGCTGA AGAAGCGGCC TAGGACAGAA CATCTGTATA 3540

CCCTCTGTTT CCCTTTCACT GGCATGGGAG ACCCTTGACA ACTGCTGAGA AAACATGCCT 3600

CTGCCAAAGG ATGTGATATA TAAGTGTACA TATGTGCTGG AATTCTAACA AGTCATAGGT 3660

TAATATTTAA GACACTGAAA AATCTAAGTG ATATAAATCA GATTCTTCTC TCTCATTTTA 3720

TCCCTCACCT GTAGCATGCC AGTCCCGTTT CATTTAGTCA TGTGACCACT CTGTCTTGTG 3780

TTTCCACAGC CTGCAAGTTC AGTCCAGGAT GCTAACATCT AAAAATAGAC TTAAATCTCA 3840

TTGCTTACAA GCCTAAGAAT CTTTAGAGAA GTATACATAA GTTTAGGATA AAATAATGGG 3900

ATTTTCTTTT CTTTTCTCTG GTAATATTGA CTTGTATATT TTAAGAAATA ACAGAAAGCC 3960

TGGGTGACAT TTGGGAGACA TGTGACATTT ATATATTGAA TTAATATCCC TACATGTATT 4020

GCACATTGTA AAAAGTTTTA GTTTTGATGA GTTGTGAGTT TACCTTGTAT ACTGTAGGCA 4080

CACTTTGCAC TGATATATCA TGAGTGAATA AATGTCTTGC CTACTCAAAA AAAAAAAA



Seq ID No: 147 Protein sequence:
Protein Accession #: NP_000450.1

1 11 21 31 41 51
| | | | | |

MDSLASLVLC GVSLLLSGTV EGAMDLILIN SLPLVSDAET SLTCIASGWR PHEPITIGRD 60

FEALMNQHQD PLEVTQDVTR EWAKKWWKR EKASKINGAY FCEGRVRGEA IRIRTMKMRQ 120

QASFLPATLT MTVDKGDNVN ISFKKVLIKE EDAVIYKNGS FIHSVPRHBV PDILEVHLPH 180

AQPQDAGVYS ARYIGGNLFT SAFTRLIVRR CEAQKWGPEC NHLCTACMNN GVCHEDTGEC 240

ICPPGFMGRT CEKACELHTF GRTCKERCSG QEGCKSYVFC LPDPYGCSCA TGWKGLQCNE 300

ACHPGFYGPD CKIiRCSCNNG EMCDRFQGCL CSPGWQGIiQC EREGIPRMTP KIVDLPDHIE 360

VNSGKFNPIC KASGWPLPTN EEMTLVKPDG TVLHPKDFNH TDHFSVAIFT IHRILPPDSG 420

VWVCSVNTVA GMVEKPFNIS VKVLPKPLNA PNVIDTGHNF AVINISSEPY FGDGPIKSKK 480

LLYKPVNHYB AWQHIQVTNB IVTLNYLEPR TEYELCVQLV RRGEGGEGHP GPVRRFTTAS 540

IGLPPPRGLN LLPKSQTTLN LTWQPIFPSS EDDFYVEVER RSVQKSDQQN IKVPGNLTSV 600

LLNNLHPREQ YWRARVNTK AQGEWSEDLT AWTLSDILPP QPENIKISWI THSSAVISWT 660

ILDGYSISSI TIRYKVQGKN EDQHVDVKIK NATIIQYQLK GLEPETAYQV DIFAENNTGS 720

SNPAFSHELV TLPESQAPAD LGGGKMLLIA ILGSAGMTCL TVLLAFLIIL QLKRANVQRR 780

MAQAFQNVRE BPAVQFNSGT LALNRKVKNN PDPTIYPVLD WNDIKFQDVI GEGNFGQVLK 840

ARIKKDGLRM DAAIKRMKEY ASKDDHRDFA GELEVLCKLG HHPNIINLLG ACEHRGYLYli 900

AIEYAPHGNL LDFLRKSRVL ETDPAFAIAN STASTLSSQQ LLHFAADVAR GMDYLSQKQF 960

IHRDLAARNI LVGENYVAKI ADFGLSRGQB VYVKKTKGRL PVRWMAIESL NYSVYTTNSD 1020

VWSYGVLLWE IVSLGGTPYC GMTCAELYEK hPQGYRLEKP LNCDDEVYDL MRQCWREKPY 1080

ERPSFAQILV SLNRKLEERK TYVNTTLYEK FTYAGIDCSA EEAA
Seq ID NO: 148 DNA sequence 

Nucleic Acid Accession #: NM_000552.2

Coding sequence: 311-8752 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

AGCTCACAGC TATTGTGGTG GGAAAGGGAG GGTGGTTGGT GGATGTCACA GCTTGGGCTT 60

TATCTCCCCC AGCAGTGGGG ACTCCACAGC CCCTGGGCTA CATAACAGCA AGACAGTCCG 120

GAGCTGTAGC AGACCTGATT GAGCCTTTGC AGCAGCTGAG AGCATGGCCT AGGGTGGGCG 180

GCACGATTGT CCAGCAGCTG AGTTTCCCAG GGACCTTGGA GATAGCCGCA GCCCTCATTT 240

GCAGGGGAAG GCACCATTGT CCAGCAGCTG AGTTTCCCAG GGACCTTGGA GATAGCCGCA 300

GCCCTCATTT ATGATTCCTG CCAGATTTGC CGGGGTGCTG CTTGCTCTGG CCCTCATTTT 360

GCCAGGGACC CTTTGTGCAG AAGGAACTCG CGGCAGGTCA TCCACGGCCC GATGCAGCCT 420

TTTCGGAAGT GACTTCGTCA ACACCTTTGA TGGGAGCATG TACAGCTTTG CGGGATACTG 480

CAGTTACCTC CTGGCAGGGG GCTGCCAGAA ACGCTCCTTC TCGATTATTG GGGACTTCCA 540

GAATGGCAAG AGAGTGAGCC TCTCCGTGTA TCTTGGGGAA TTTTTTGACA TCCATTTGTT 600

TGTCAATGGT ACCGTGACAC AGGGGGACCA AAGAGTCTCC ATGCCCTATG CCTCCAAAGG 660

GCTGTATCTA GAAACTGAGG CTGGGTACTA CAAGCTGTCC GGTGAGGCCT ATGGCTTTGT 720

GGCCAGGATC GATGGCAGCG GCAACTTTCA AGTCCTGCTG TCAGACAGAT ACTTCAACAA 780

GACCTGCGGG CTGTGTGGCA ACTTTAACAT CTTTGCTGAA GATGACTTTA TGACCCAAGA 840

AGGGACCTTG ACCTCGGACC CTTATGACTT TGCCAACTCA TGGGCTCTGA GCAGTGGAGA 900

ACAGTGGTGT GAACGGGCAT CTCCTCCCAG CAGCTCATGC AACATCTCCT CTGGGGAAAT 960

GCAGAAGGGC CTGTGGGAGC AGTGCCAGCT TCTGAAGAGC ACCTCGGTGT TTGCCCGCTG 1020

CCACCCTCTG GTGGACCCCG AGCCTTTTGT GGCCCTGTGT GAGAAGACTT TGTGTGAGTG 1080

TGCTGGGGGG CTGGAGTGCG CCTGCCCTGC CCTCCTGGAG TACGCCCGGA CCTGTGCCCA 1140

GGAGGGAATG GTGCTGTACG GCTGGACCGA CCACAGCGCG TGCAGCCCAG TGTGCCCTGC 1200

TGGTATGGAG TATAGGCAGT GTGTGTCCCC TTGCGCCAGG ACCTGCCAGA GCCTGCACAT 1260

CAATGAAATG TGTCAGGAGC GATGCGTGGA TGGCTGCAGC TGCCCTGAGG GACAGCTCCT 1320

GGATGAAGGC CTCTGCGTGG AGAGCACOGA GTGTCOCTGC GTGCATTCCG GAAAGCGCTA 1380

CCCTCCCGGC ACCTCCCTCT CTCGAGACTG CAACACCTGC ATTTGCCGAA ACAGCCAGTG 1440

GATCTGCAGC AATGAAGAAT GTCCAGGGGA GTGCCTTGTC ACTGGTCAAT CCCACTTCAA 1500






GAGCTTTGAC AACAGATACT TCACCTTCAG TGGGATCTGC CAGTACCTGC TGGCCCGGGA 1560

TTGCCAGGAC CACTCCTTCT CCATTGTCAT TGAGACTGTC CAGTGTGCTG ATGACCGCGA 1620

CGCTGTGTGC ACCCGCTCOG TCACCGTCCG GCTGCCTGGC CTGCACAACA GCCTTGTGAA 1680

ACTGAAGCAT GGGGCAGGAG TTGCCATGGA TGGCCAGGAC ATCCAGCTCC CCCTCCTGAA 1740

AGGTGACCTC CGCATCCAGC ATACAGTGAC GGCCTCCGTG CGCCTCAGCT ACGGGGAGGA 1800

CCTGCAGATG GACTGGGATG GCCGCGGGAG GCTGCTGGTG AAGCTGTCCC CCGTCTACGC 1860

CGGGAAGACC TGCGGCCTGT GTGGGAATTA CAATGGCAAC CAGGGCGACG ACTTCCTTAC 1920

CCCCTCTGGG CTGGCAGAGC CCCGGGTGGA GGACTTCGGG AACGCCTGGA AGCTGCACGG 1980

GGACTGCCAG GACCTGCAGA AGCAGCACAG CGATCCCTGC GCCCTCAACC CGCGCATGAC 2040

CAGGTTCTCC GAGGAGGCGT GCGCGGTCCT GACGTCCCCC ACATTCGAGG CCTGCCATCG 2100

TGCCGTCAGC CCGCTGCCCT ACCTGCGGAA CTGCCGCTAC GACGTGTGCT CCTGCTCGGA 2160

CGGCCGCGAG TGCCTGTGCG GCGCCCTGGC CAGCTATGCC GCGGCCTGCG CGGGGAGAGG 2220

CGTGCGCGTC GCGTGGCGCG AGCCAGGCCG CTGTGAGCTG AACTGCCCGA AAGGCCAGGT 2280

GTACCTGCAG TGCGGGACCC CCTGCAACCT GACCTGCCGC TCTCTCTCTT ACCCGGATGA 2340

GGAATGCAAT GAGGCCTGCC TGGAGGGCTG CTTCTGCCCC CCAGGGCTCT ACATGGATGA 2400

GAGGGGGGAC TGCGTGCCCA AGGCCCAGTG CCCCTGTTAC TATGACGGTG AGATCTTCCA 2460

GCCAGAAGAC ATCTTCTCAG ACCATCACAC CATGTGCTAC TGTGAGGATG GCTTCATGCA 2520

CTGTACCATG AGTGGAGTCC CCGGAAGCTT GCTGCCTGAC GCTGTCCTCA GCAGTCCCCT 2580

GTCTCATCGC AGCAAAAGGA GCCTATCCTG TCGGCCCCCC ATGGTCAAGC TGGTGTGTCC 2640

OGCTGACAAC CTGCGGGCTG AAGGGCTCGA GTGTACCAAA ACGTGCCAGA ACTATGACCT 2700

GGAGTGCATG AGCATGGGCT GTGTCTCTGG CTGCCTCTGC CCCCCGGGCA TGGTCCGGCA 2760

TGAGAACAGA TGTGTGGCCC TGGAAAGGTG TCCCTGCTTC CATCAGGGCA AGGAGTATGC 2820

CCCTGGAGAA ACAGTGAAGA TTGGCTGCAA CACTTGTGTC TGTCGGGACC GGAAGTGGAA 2880

CTGCACAGAC CATGTGTGTG ATGCCACGTG CTCCACGATC GGCATGGCCC ACTACCTCAC 2940

CTTCGACGGG CTCAAATACC TGTTCCCOGG GGACTGCCAG TACGTTCTGG TGCAGGATTA 3000

CTGCGGCAGT AACCCTGGGA CCTTTCGGAT CCTAGTGGGG AATAAGGGAT GCAGCCACCC 3060

CTCAGTGAAA TGCAAGAAAC GGGTCACCAT CGTGGTGGAG GGAGGAGAGA TTGAGCTGTT 3120

TGACGGGGAG GTGAATGTGA AGAGGCCCAT GAAGGATGAG ACTCACTTTG AGGTGGTGGA 3180

GTCTGGCCGG TACATCATTC TGCTGCTGGG CAAAGCCCTC TCCGTGGTCT GGGACCGCCA 3240

CCTGAGCATC TCCGTGGTCC TGAAGCAGAC ATACCAGGAG AAAGTGTGTG GCCTGTGTGG 3300

GAATTTTGAT GGCATCCAGA ACAATGACCT CACCAGCAGC AACCTCCAAG TGGAGGAAGA 3360

CCCTGTGGAC TTTGGGAACT CCTGGAAAGT GAGCTCGCAG TGTGCTGACA CCAGAAAAGT 3420

GCCTCTGGAC TCATCCCCTG CCACCTGCCA TAACAACATC ATGAAGCAGA CGATGGTGGA 3480

TTCCTCCTGT AGAATCCTTA CCAGTGACGT CTTCCAGGAC TGCAACAAGC TGGTGGACCC 3540

CGAGCCATAT CTGGATGTCT GCATTTACGA CACCTGCTCC TGTGAGTCCA TTGGGGACTG 3600

CGCCTGCTTC TGCGACACCA TTGCTGCCTA TGCCCAOGTG TGTGCCCAGC ATGGCAAGGT 3660

GGTGACCTGG AGGACGGCCA CATTGTGCCC CCAGAGCTGC GAGGAGAGGA ATCTCCGGGA 3720

GAACGGGTAT GAGTGTGAGT GGCGCTATAA CAGCTGTGCA CCTGCCTGTC AAGTCACGTG 3780

TCAGCACCCT GAGCCACTGG CCTGCCCTGT GCAGTGTGTG GAGGGCTGCC ATGCCCACTG 3840

CCCTCCAGGG AAAATCCTGG ATGAGCTTTT GCAGACCTGC GTTGACCCTG AAGACTGTCC 3900

AGTGTGTGAG GTGGCTGGCC GGCGTTTTGC CTCAGGAAAG AAAGTCACCT TGAATCCCAG 3960

TGACCCTGAG CACTGCCAGA TTTGCCACTG TGATGTTGTC AACCTCACCT GTGAAGCCTG 4020

CCAGGAGCCG GGAGGCCTGG TGGTGCCTCC CACAGATGCC CCGGTGAGCC CCACCACTCT 4080

GTATGTGGAG GACATCTCGG AACCGCCGTT GCACGATTTC TACTGCAGCA GGCTACTGGA 4140

CCTGGTCTTC CTGCTGGATG GCTCCTCCAG GCTGTCCGAG GCTGAGTTTG AAGTGCTGAA 4200

GGCCTTTGTG GTGGACATGA TGGAGCGGCT GCGCATCTCC CAGAAGTGGG TCCGCGTGGC 4260

CGTGGTGGAG TACCACGAOG GCTCCCACGC CTACATCGGG CTCAAGGACC GGAAGCGACC 4320

GTCAGAGCTG CGGCGCATTG CCAGCCAGGT GAAGTATGCG GGCAGCCAGG TGGCCTCCAC 4380

CAGCGAGGTC TTGAAATACA CACTGTTCCA AATCTTCAGC AAGATCGACC GCCCTGAAGC 4440

CTCCCGCATC GCCCTGCTCC TGATGGCCAG CCAGGAGCCC CAACGGATGT CCCGGAACTT 4500

TGTCCGCTAC GTCCAGGGCC TGAAGAAGAA GAAGGTCATT GTGATCCCGG TGGGCATTGG 4560

GCCCCATGCC AACCTCAAGC AGATCCGCCT CATCGAGAAG CAGGCCCCTG AGAACAAGGC 4620

CTTCGTGCTG AGCAGTGTGG ATGAGCTGGA GCAGCAAAGG GACGAGATCG TTAGCTACCT 4680

CTGTGACCTT GCCCCTGAAG CCCCTCCTCC TACTCTGCCC CCCCACATGG CACAAGTCAC 4740

TGTGGGCCCG GGGCTCTTGG GGGTTTCGAC CCTGGGGCCC AAGAGGAACT CCATGGTTCT 4800

GGATGTGGCG TTOGTCCTGG AAGGATCGGA CAAAATTGGT GAAGCCGACT TCAACAGGAG 4860

CAAGGAGTTC ATGGAGGAGG TGATTCAGCG GATGGATGTG GGCCAGGACA GCATCCACGT 4920

CACGGTGCTG CAGTACTCCT ACATGGTGAC CGTGGAGTAC CCCTTCAGCG AGGCACAGTC 4980

CAAAGGGGAC ATCCTGCAGC GGGTGCGAGA GATCCGCTAC CAGGGCGGCA ACAGGACCAA 5040

CACTGGGCTG GCCCTGCGGT ACCTCTCTGA CCACAGCTTC TTGGTCAGCC AGGGTGACCG 5100

GGAGCAGGCG CCCAACCTGG TCTACATGGT CACCGGAAAT CCTGCCTCTG ATGAGATCAA 5160

GAGGCTGCCT GGAGACATCC AGGTGGTGCC CATTGGAGTG GGCCCTAATG CCAACGTGCA 5220

GGAGCTGGAG AGGATTGGCT GGCCCAATGC CCCTATCCTC ATCCAGGACT TTGAGACGCT 5280

CCCCCGAGAG GCTCCTGACC TGGTGCTGCA GAGGTGCTGC TCCGGAGAGG GGCTGCAGAT 5340

CCCCACCCTC TCCCCTGCAC CTGACTGCAG CCAGCCCCTG GACGTGATCC TTCTCCTGGA 5400

TGGCTCCTCC AGTTTCCCAG CTTCTTATTT TGATGAAATG AAGAGTTTCG CCAAGGCTTT 5460

CATTTCAAAA GCCAATATAG GGCCTCGTCT CACTCAGGTG TCAGTGCTGC AGTATGGAAG 5520

CATCACCACC ATTGACGTGC CATGGAACGT GGTCCCGGAG AAAGCCCATT TGCTGAGCCT 5580

TGTGGACGTC ATGCAGCGGG AGGGAGGCCC CAGCCAAATC GGGGATGCCT TGGGCTTTGC 5640

TGTGCGATAC TTGACTTCAG AAATGCATGG TGCCAGGCCG GGAGCCTCAA AGGCGGTGGT 5700

CATCCTGGTC ACGGACGTCT CTGTGGATTC AGTGGATGCA GCAGCTGATG CCGCCAGGTC 5760

CAACAGAGTG ACAGTGTTCC CTATTGGAAT TGGAGATCGC TACGATGCAG CCCAGCTACG 5820

GATCTTGGCA GGCCCAGCAG GCGACTCCAA CGTGGTGAAG CTCCAGCGAA TCGAAGACCT 5880

CCCTACCATG GTCACCTTGG GCAATTCCTT CCTCCACAAA CTGTGCTCTG GATTTGTTAG 5940

GATTTGCATG GATGAGGATG GGAATGAGAA GAGGCCCGGG GACGTCTGGA CCTTGCCAGA 6000



CCAGTGCCAC ACCGTGACTT GCCAGCCAGA TGGCCAGACC TTGCTGAAGA GTCATCGGGT 6060

CAACTGTGAC CGGGGGCTGA GGCCTTCGTG CCCTAACAGC CAGTCCCCTG TTAAAGTGGA 6120

AGAGACCTGT GGCTGCCGCT GGACCTGCCC CTGCGTGTGC ACAGGCAGCT CCACTCGGCA 6180

CATCGTGACC TTTGATGGGC AGAATTTCAA GCTGACTGGC AGCTGTTCTT ATGTCCTATT 6240

TCAAAACAAG GAGCAGGACC TGGAGGTGAT TCTCCATAAT GGTGCCTGCA GCCCTGGAGC 6300

AAGGCAGGGC TGCATGAAAT CCATCGAGGT GAAGCACAGT GCCCTCTCCG TCGAGCTGCA 6360

CAGTGACATG GAGGTGACGG TGAATGGGAG ACTGGTCTCT GTTCCTTACG TGGGTGGGAA 6420

CATGGAAGTC AACGTTTATG GTGCCATCAT GCATGAGGTC AGATTCAATC ACCTTGGTCA 6480

CATCTTCACA TTCACTCCAC AAAACAATGA GTTCCAACTG CAGCTCAGCC CCAAGACTTT 6540

TGCTTCAAAG ACGTATGGTC TGTGTGGGAT CTGTGATGAG AACGGAGCCA ATGACTTCAT 6600

GCTGAGGGAT GGCACAGTCA CCACAGACTG GAAAACACTT GTTCAGGAAT GGACTGTGCA 6660

GCGGCCAGGG CAGACGTGCC AGCCCATCCT GGAGGAGCAG TGTCTTGTCC CCGACAGCTC 6720

CCACTGCCAG GTCCTCCTCT TACCACTGTT TGCTGAATGC CACAAGGTCC TGGCTCCAGC 6780

CACATTCTAT GCCATCTGCC AGCAGGACAG TTGCCACCAG GAGCAAGTGT GTGAGGTGAT 6840

CGCCTCTTAT GCCCACCTCT GTCGGACCAA CGGGGTCTGC GTTGACTGGA GGACACCTGA 6900

TTTCTGTGCT ATGTCATGCC CACCATCTCT GGTCTACAAC CACTGTGAGC ATGGCTGTCC 6960

CCGGCACTGT GATGGCAACG TGAGCTCCTG TGGGGACCAT CCCTCCGAAG GCTGTTTCTG 7020

CCCTCCAGAT AAAGTCATGT TGGAAGGCAG CTGTGTCCCT GAAGAGGCCT GCACTCAGTG 7080

CATTGGTGAG GATGGAGTCC AGCACCAGTT CCTGGAAGCC TGGGTCCCGG ACCACCAGCC 7140

CTGTCAGATC TGCACATGCC TCAGCGGGCG GAAGGTCAAC TGCACAACGC AGCCCTGCCC 7200

CACGGCCAAA GCTCCCACGT GTGGCCTGTG TGAAGTAGCC CGCCTCCGCC AGAATGCAGA 7260

CCAGTGCTGC CCCGAGTATG AGTGTGTGTG TGACCCAGTG AGCTGTGACC TGCCCCCAGT 7320

GCCTCACTGT GAACGTGGCC TCCAGCCCAC ACTGACCAAC CCTGGCGAGT GCAGACCCAA 7380

CTTCACCTGC GCCTGCAGGA AGGAGGAGTG CAAAAGAGTG TCCCCACCCT CCTGCCCCCC 7440

GCACCGTTTG CCCACCCTTC GGAAGACCCA GTGCTGTGAT GAGTATGAGT GTGCCTGCAA 7500

CTGTGTCAAC TCCACAGTGA GCTGTCCCCT TGGGTACTTG GCCTCAACCG CCACCAATGA 7560

CTGTGGCTGT ACCACAACCA CCTGCCTTCC CGACAAGGTG TGTGTCCACC GAAGCACCAT 7620

CTACCCTGTG GGCCAGTTCT GGGAGGAGGG CTGCGATGTG TGCACCTGCA CCGACATGGA 7680

GGATGCCGTG ATGGGCCTCC GCGTGGCCCA GTGCTCCCAG AAGCCCTGTG AGGACAGCTG 7740

TCGGTCGGGC TTCACTTACG TTCTGCATGA AGGCGAGTGC TGTGGAAGGT GCCTGCCATC 7800

TGCCTGTGAG GTGGTGACTG GCTCACCGCG GGGGGACTCC CAGTCTTCCT GGAAGAGTGT 7860

CGGCTCCCAG TGGGCCTCCC CGGAGAACCC CTGCCTCATC AATGAGTGTG TCCGAGTGAA 7920

GGAGGAGGTC TTTATACAAC AAAGGAACGT CTCCTGCCCC CAGCTGGAGG TCCCTGTCTG 7980

CCCCTCGGGC TTTCAGCTGA GCTGTAAGAC CTCAGCGTGC TGCCCAAGCT GTCGCTGTGA 8040

GCGCATGGAG GCCTGCATGC TCAATGGCAC TGTCATTGGG CCCGGGAAGA CTGTGATGAT 8100

OGATGTGTGC ACGACCTGCC GCTGCATGGT GCAGGTGGGG GTCATCTCTG GATTCAAGCT 8160

GGAGTGCAGG AAGACCACCT GCAACCCCTG CCCCCTGGGT TACAAGGAAG AAAATAACAC 8220

AGGTGAATGT TGTGGGAGAT GTTTGCCTAC GGCTTGCACC ATTCAGCTAA GAGGAGGACA 8280

GATCATGACA CTGAAGCGTG ATGAGACGCT CCAGGATGGC TGTGATACTC ACTTCTGCAA 8340

GGTCAATGAG AGAGGAGAGT ACTTCTGGGA GAAGAGGGTC ACAGGCTGCC CACCCTTTGA 8400

TGAACACAAG TGTCTGGCTG AGGGAGGTAA AATTATGAAA ATTCCAGGCA CCTGCTGTGA 8460

CACATGTGAG GAGCCTGAGT GCAACGACAT CACTGCCAGG CTGCAGTATG TCAAGGTGGG 8520

AAGCTGTAAG TCTGAAGTAG AGGTGGATAT CCACTACTGC CAGGGCAAAT GTGCCAGCAA 8580

AGCCATGTAC TCCATTGACA TCAACGATGT GCAGGACCAG TGCTCCTGCT GCTCTCCGAC 8640

ACGGACGGAG CCCATGCAGG TGGCCCTGCA CTGCACCAAT GGCTCTGTTG TGTACCATGA 8700

GGTTCTCAAT GCCATGGAGT GCAAATGCTC CCCCAGGAAG TGCAGCAAGT GAGGCTGCTG 8760

CAGCTGCATG GGTGCCTGCT GCTGCCTGCC TTGGCCTGAT GGCCAGGCCA GAGTGCTGCC 8820

AGTCCTCTGC ATGTTCTGCT CTTGTGCCCT TCTGAGCCCA CAATAAAGGC TGAGCTCTTA 8880

TCTTGCTGCA TGTTCTGCTC TTGTGCCCTT CTGAGCCCAC AAT



Seq ID No: 149 Protein sequence:
Protein Accession #: NP_000543.1



1 11 21 31 41 51 

I I I I I I 

MIPARFAGVL LALALILPGT LCAEGTRGRS STARCSLFGS DFVNTFDGSM YSFAGYCSYL 60 

ItAGGCQKRSF SIIGDFQNGK RVSLSVYLGE FFDIHLFVNG TVTQGDQRVS MPYASKGLYL 120 

ETEAGYYKLS GEAYGFVARI DGSGNFQVLL SDRYFNKTCG LCGNFNIFAE DDFMTQEGTL 180 

TSDPYDFANS WAI*SSGEQWC ERASPPSSSC NISSGEMQKG LWEQCQLLKS TSVFARCHPL 240 

VDPEPFVALC EKTLCECAGG LECACPALLE YARTCAQEGM VLYGWTDHSA CSPVCPAGME 300 

YRQCVSPCAR TCQSLHINEM CQERCVDGCS CPEGQLLDBG LCVESTECPC VHSGKRYPPG 360 

TSLSRDCNTC ICRNSQWICS NEECPGECLV TGQSHFKSFD NRYFTFSGIC QYLLARDCQD 420 

HSFSIVIETV QCADDRDAVC TRSVTVRIiPG LHNSLVKLKH GAGVAMDGQD IQLPLLKGDL 480 

RIQHTVTASV RLSYGEDLQM DWDGRGRLLV KLSPVYAGKT CGLCGNYNGN QGDDFLTPSG 540 

IiABPRVEDFG NAWKLHGDCQ DLQKQRSDPC ALNPRMTRFS EEACAVLTSP TFEACHRAVS 600 

PLPYLRNCRY DVCSCSDGRE CLCGALASYA AACAGRGVRV AWREPGRCEL NCPKGQVYLQ 660 

CGTPCNLTCR SLSYPDEECN EACIiEGCFCP PGLYMDERGD CVPKAQCPCY YDGEIFQPED 720 

IFSDHHTMCY CEDGFMHCTM SGVPGSLLPD AVLSSPLSHR SKRSLSCRPP MVKLVCPADN 780 

LRAEGLECTK TCQNYDLECM SMGCVSGCLC PPGMVRHEKR CVALERCPCF HQGKEYAPGE 840 

TVKIGCNTCV CRDRKWNCTD HVCDATCSTI GMAHYLTFDG LKYLFPGECQ YVLVQDYCGS 900 

NPGTFRILVG NKGCSHPSVK CKKRVTILVE GGEIELFDGE VNVKRPMKDE THFEWESGR 960 

YIILLLGKAL SWWDRHLSI SWLKQTYQE KVCGIiCGNFD GIQNNDLTSS NLQVEEDPVD 1020 

FGNSWKVSSQ CADTRKVPU) SSPATCHNNI MKQTMVDSSC RILTSDVFQD CNKLVDPEPY 1080 

U5VCIYDTCS CESIGDCACF CDTIAAYAHV CAQHGKWTW RTATLCPQSC EERNLRENGY 1140 

ECEWRYNSCA PACQVTCQHP EPIiACPVQCV EGCHAHCPPG KlUDEhUQTC VDPEDCPVCE 1200

VAGRRFASGK KVTLNPSDPE HCQICHCDW NLTCEACQEP GGLWPPTDA PVSPTTLYVE 1260

DISEPPLHDF YCSRLLDLVF LLDGSSRIiSE AEFEVLKAFV VDMMERLRIS QKWVRVAWE 1320

YHDGSHAYTG LKDRKRPSEh RRIASQVKYA GSQVASTSEV LKYTLFQIFS KIDRPEASRI 1380

ALLLMASQEP QRMSRNFVRY VQGIiKKKKVI VIPVGIGPHA NLKQIRLIEK QAPENKAFVL 1440

SSVDEIiEQQR DEIVSYLCDL APEAPPPTLP PHMAQVTVGP GLLGVSTLGP KRNSMVLDVA 1500

FVLBGSDKIG EADFNRSKEF MEEVIQRMDV GQDSIHVTVL QYSYMVTVEY PFSEAQSKGD 1560

ILQRVRBIRY QGGNRTNTGL ALRYLSDHSF LVSQGDREQA PNLVYMVTGN PASDBXKRLP 1620

GDIQWPIGV GPNANVQELE RIGWPNAPIL IQDFETLPRE APDLVLQRCC SGEGLQIPTL 1680

SPAPDCSQPI. DVILLLDGSS SFPASYFDEM KSFAKAFISK ANIGPRLTQV SVLQYGSITT 1740

IDVPWNWPB KAHLItSLVDV MQREGGPSQI GDALGFAVRY LTSEMHGARP GASKAWILV 1800

TDVSVDSVDA AADAARSNRV TVFPIGIGDR YDAAQLRILA GPAGDSNWK LQRIEDLPTM 1860

VTLGNSFLHK LCSGPVRICM DEBGNEKRPG DVWTLPDQCH TVTCQPDGQT LLKSHRVNCD 1920

RGLRPSCPNS QSPVKVEETC GCRWTCPCVC TGSSTRHIVT FDGQNFKLTG SCSYVLFQNK 1980

EQDLBVTLHN GACSPGARQG OSKSIEVKHS ALSVELHSDM EVTVNGRLVS VPYVGGNMEV 2040

NVYGAIMHEV RFNHLGHIFT FTPQNNEFQIi OLSPKTFASK TYGLCGICDE NGANDFMLRD 2100

GTVTTDWKTL VQEWTVQRPG QTCQPILEEQ CLVPDSSHCQ VLLLPLFAEC HKVLAPATFY 2160

AICQQDSCHQ BQVCEVIASY AHLCRTNGVC VDWRTPDFCA MSCPPSLVYN HCEHGCPRHC 2220

DGNVSSCGDH PSBGCFCPPD KVMLBGSCVP EEACTQCIGE DGVOHQFLEA WVPDHQPCQl 2280

CTCLSGRKVN CTTQPCPTAK APTCGLCEVA RLRQNADQCC PEYECVCDPV SCDLPPVPHC 2340

ERGLQPTLTN PGECRPNFTC ACRKEECKRV SPPSCPPHRL PTLRKTQCCD BYECACNCVN 2400

STVSCPLGYL ASTATNDCGC TTTTCLPDKV CVHRSTIYPV GQFWEEGCDV CTCTDMEDAV 2460

KGLRVAQCSQ KPCEDSCRSG FTYVLHEGEC CGRCLPSACB WTGSPRGDS QSSWKSVGSQ 2520

WASPENPCLX NECVRVKEEV FIQQRNVSCP QLEVPVCPSG FQLSCKTSAC CPSCRCERME 2580

ACMLNGTVIG PGKTVMIDVC TTCRCMVQVG VTSGFKLECR KTTCNPCPIjG YKEENNTGEC 2640

CGRCLPTACT IQLRGGQIMT LKRDETXiQDG CDTHFCKVNE RGEYFWEKRV TGCPPFDEHK 2700

CLAEGGKIMK IPGTCCDTCE EPECNDITAR LQYVKVGSCK SEVEVDIHYC QGKCASKAMY 2760

SIDINDVQDQ CSCCSPTRTE PMQVALHCTN GSWYHEVLN AMECKCSPRK CSK



Seq ID NO: 150 DNA sequence

Nucleic Acid Accession #: NM_001508.1

Coding sequence: 1-1362 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGGCTTCAC CCAGCCTCCC GGGCAGTGAC TGCTCCCAAA TCATTGATCA CAGTCATGTC 60

CCCGAGTTTG AGGTGGCCAC CTGGATCAAA ATCACCCTTA TTCTGGTGTA CCTGATCATC 120

TTCGTGATGG GCCTTCTGGG GAACAGCGTC ACCATTOGGG TCACCCAGGT GCTGCAGAAG 180

AAAGGATACT TGCAGAAGGA GGTGACAGAC CACATGGTGA GTTTGGCTTG CTCGGACATC 240

TTGGTGTTCC TCATCGGCAT GCCCATGGAG TTCTACAGCA TCATCTGGAA TCCCCTGACC 300

ACGTCCAGCT ACACCCTGTC CTGCAAGCTG CACACTTTCC TCTTCGAGGC CTGCAGCTAC 360

GCTACGCTGC TGCACGTGCT GACGCTCAGC TTTGAGCGCT ACATGGCCAT CTGTCACCCC 420

TTCAGGTACA AGGCTGTGTC GGGACCTTGC CAGGTGAAGC TGCTGATTGG CTTCGTCTGG 480

GTCACCTCCG CCCTGGTGGC ACTGCCCTTG CTGTTTGCCA TGGGTACTGA GTACCCCCTG 540

GTGAACGTGC CCAGCCACCG GGGTCTCACT TGCAACCGCT CCAGCACCCG CCACCACGAG 600

CAGCCCGAGA CCTCCAATAT GTCCATCTGT ACCAACCTCT CCAGCCGCTG GACCGTGTTC 660

CAGTCCAGCA TCTTCGGCGC CTTCGTGGTC TACCTCGTGG TCCTGCTCTC CGTAGCCTTC 720

ATGTGCTGGA ACATGATGCA GGTGCTCATG AAAAGCCAGA AGGGCTCGCT GGCCGGGGGC 780

ACGCGGCCTC CGCAGCTGAG GAAGTCCGAG AGCGAAGAGA GCAGGACCGC CAGGAGGCAG 840

ACCATCATCT TCCTGAGGCT GATTGTTGTG ACATTGGCCG TATGCTGGAT GCCCAACCAG 900

ATTCGGAGGA TCATGGCTGC GGCCAAACCC AAGCACGACT GGACGAGGTC CTACTTCCGG 960

GCGTACATGA TCCTCCTCCC CTTCTCGGAG ACGTTTTTCT ACCTCAGCTC GGTCATCAAC 1020

CCGCTCCTGT ACACGGTGTC CTCGCAGCAG TTTCGGOGGG TGTTCGTGCA GGTGCTGTGC 1080

TGCCGCCTGT CGCTGCAGCA CGCCAACCAC GAGAAGCGCC TGCGOGTACA TGCGCACTCC 1140

ACCACCGACA GOGCCOGCTT TGTGCAGCGC CCGTTGCTCT TCGOGTCCCG GCGCCAGTCC 1200

TCTGCAAGGA GAACTGAGAA GATTTTCTTA AGCACTTTTC AGAGCGAGGC CGAGCCCCAG 1260

TCTAAGTCCC AGTCATTGAG TCTCGAGTCA CTAGAGCCCA ACTCAGGOGC GAAACCAGCC 1320

AATTCTGCTG CAGAGAATGG TTTTCAGGAG CATGAAGTTT GA
1320 



Seq ID No: 151 Protein sequence:
Protein Accession #: NP_001499.1

1 11 21 31 41 51
| | | | | |

MASPSLPGSD CSQIIDHSHV PEFEVATWIK ITLILVYLII FVMGLLGNSV TIRVTQVLQK 60

KGYLQKEVTD HMVSLACSDI LVFIilGMPME FYSIIWNPLT TSSYTLSCKL HTFIiFEACSY 120

ATLLHVLTLS FERYIAICHP FRYKAVSGPC QVKLLIGFVW VTSALVALPL LFAMGTBYPL 180

VNVPSHRGLT CNRSSTRHHS QPETSNMSIC TNliSSRWTVF QSSIFGAFW YLWLLSVAF 240

MCWNMMOVLM KSQKGSLAGG TRPPQIiRKSE SEESRTARRQ TIIFLRLIW TLAVCWMPNQ 300

IRRIMAAAKP KHDWTRSYFR AYMXLLiPFSS TFFYLSSVTN PLLYTVSSQQ FRRVFVQVLC 360

CRLSLOHANH EKRLRVHAHS TTDSARFVQR PLLFASRRQS SARRTEKIFL STFQSEAEPQ 420

SKSQSLSLES LEPNSGAKPA NSAAENGFQB HEV
420 



Seq ID NO: 152 DNA sequence
Nucleic Acid Accession #: none found

Coding sequence: 3-65 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

TTATTATTTT GTGTAAACTA TATTCTGCTT ATAGAGAGTC TCTGAGACTA AAATTGACAA 60

CTTGAAAAGT ATTCCAAGGA ATATTATGAA AATAGGGCAA CATGGACTGT TTAAGATCTC 120

CATGTAATTG AAATTCATGC AAGGAAACAA CTCATAGAAA AGATAAATAT GGATGCCCTT 180

CACATGTTAT CAACCTCGTA ACTTTTGGTG CTTGCTGAAT CAGTCCATGA AAAGCTACAG 240

CCCGCTCTTT GGGAATGCTA CATACCCATT TCTGGTATTT AAAAAATATC TAGGAGGAGC 300

TAAATGACAA AACACAGCAG TGTTTTGAGG GAGAAAGGAC CATCATTTAT AATGCTCTGT 360

ACATACTACC AGAGCTGCTT GGAAAATTAA AGGCCACTTG TGGCTTTTTC CTACCAACTG 420

ATACGTTTAA ATTTGCCCTA GGATTSAGCT AACAGCAAAA AAAAAAAAAA AAAAAAAARA 480

GAGAGAAAGA AAGGAGKAAA CAGTGGTAAT AAAAAAATCC ATCTGTCTTC TTGCTATGTT 540

AATATTAATA AATCATAATA TGACAAGACC CTCACTGAAT AAGAGTATTT TCAGTCATCA 600

GAAGCCAGCT GTTGGTAGGC ATTAATGAGT TTAAAATTGT TCTCAATTGA AAAAACATCA 660

CACTATTTTG CCAAAACCAA AGTAATTATA ATACTGTGTC CTCCTGTAAT TTTTTGAGAA 720

GTGGTTATAA AGGGCATATT TACATAAATT CTACTTTATT CCTCAACTTC TTTGATGAAT 780

GTAACCCAAT TTTACTTCTT TAAAAAGTCT CAATTCAAGC TGGATTAGCC AGCTCAGCAT 840

AATCAACTAG ACAGTGGTTT GTTAAATTTA GCAGCATACT TCGTTCCCAT TCTAATTAAA 900

GTCATGAGTT CTTGAATCCC AGAGAAATAA TGCTTAGGAA CTTCTCTCAA TCTGCTTGGC 960

TTGGCCTAGA GAAGTGGCCA TTTTATCAAC AGGRAAAAAA AAAATTTTCT CTACTACAAC 1020

CCCGTTGCCT TCTGAAAAAC AGCAAGTTAT TTCTTTATAT AATTATCATT TTATTATTTT 1080

ATGGAAAATT AATTTATTAA TTAATAGCCT ATTATGTGTT CTCACTTGCT TCTCTAAGTA 1140

ATATTTTGAG ATAAAATGTT GAATAAAACC ATGGATTATA GAGAAAAGTC AAAATATATG 1200

TGTAATATTT AATTATTTTA TAAGTTTTAT AATAAAGTAT TCCATTTCTT TATCTT



Seq ID No: 153 Protein sequence: 
Protein Accession #: none found 



1 11 21 

I I I 

IILCKLYSAY RESLRLKLTT 

Seq ID NO: 154 DNA sequence 
Nucleic Acid Accession #: none found 

Coding sequence: 1-36 (underlined sequences correspond to start and stop codons) 






1 11 21 31 41 51 

I I ! I I I 

CTGGATGATA TGGAAGAAAT GGATGGGTTA AGGTAAAAGG CTGATCACAG ATGGGTTCCT 60 

CTCAAGGTTA AAATAGTTTA AGTGCCAGAA GAAAAGGTGG GCACCAGCGA ATTAAGAACC 120 

ATCTTTGAAT GGTCCCCTTG GTTAAATACT TAACTTTTGT CATCAGTGTC TGCATTTATG 180 

AAATGAAGAG GAATTCACTA ATATGCTACG TGATCTTTTG TTTGTCATGA AAAGAGTTAC 240 

TGTTGTGTAG TTCTCTGTTC CAGGGCTGCC TTTGCTCCAC AAAGCACTGA GAAGCAGTGG 300 

CCCTGTACAA CCATACTGCC TCTCAACACT GTGTAATAGG CTAACACCGC CCAGCGAACC 360 
TTCCTGGGAG ATATAAAATA CATAGGTTTA GGCTGGCAAA AAAAAAAAAA AAA 

Seq ID No: 155 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

1 I I I I I 

LDDMEEMDGL R 
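
The relationship between a "Coding sequence" range and the protein sequence listed after it can be checked mechanically. The following is a minimal sketch, not part of the original listing: it assumes the standard genetic code and 1-based inclusive coordinates, and the names CODON_TABLE and translate_cds are illustrative only. It translates positions 1-36 of Seq ID NO: 154 and reproduces the residues listed for Seq ID No: 155.

```python
# Standard genetic code, built from the conventional TCAG base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate_cds(dna, start, end):
    """Translate the 1-based, inclusive range start..end of a DNA string."""
    # Remove the spaces that separate the 10-nucleotide blocks in the listing.
    cds = dna.replace(" ", "").upper()[start - 1:end]
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        # Codons with unreadable or ambiguous characters become "X".
        residue = CODON_TABLE.get(cds[pos:pos + 3], "X")
        if residue == "*":  # a stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# First 60 nucleotides of Seq ID NO: 154 as listed above.
seq_154 = "CTGGATGATA TGGAAGAAAT GGATGGGTTA AGGTAAAAGG CTGATCACAG ATGGGTTCCT"
print(translate_cds(seq_154, 1, 36))  # prints LDDMEEMDGLR
```

The same check can be applied to the other entries by passing the listed coding range; characters that are not A, C, G or T in the transcribed nucleotide sequence will surface as "X" in the output.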



Seq ID NO: 156 DNA sequence 

Nucleic Acid Accession #: NM_032961.1 

Coding sequence: 827-3949 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CAGGCTCAGA GGCTGAAGCA GGAGGAAGGA AGGACTGGAA GGAAAAAGAG ACAGGTTAGA 60 

GGGAAAGAGG CTTGGGAAGA AAACAGCAGA AAAGAAACTG CTCATTACAC TTACAGAGAG 120 

GCAAGTAACG GTGGAGATGA GGACAGAGGG AACCAAGACT CTGAAAGACA AAAAATACAA 180 

ATAGAGCGAA AGAGGAAAAA AATGTCAAGA AGAACATCCA TCCGGAGAAA TGAAGAGAAT 240 

GAAAGTTTTA AACTGCAGAG CCGTTCTGTG CTTTTCCGGC ACAAAATTAT ATCGCTGATT 300 

TTAAGCCCTT TTGCATTTGC CAGCCGTTGA CATTAAGAGG CATGTTTAAC GGTGCCAACA 360 

GCATCTCCTT TTCCTTCTCC TCTTCCTCTT CTTCTTCTTC CTCCTCCTCC TCCTCTTTTT 420 
CCTCCTCCTC GTTCTCCTCC CATCAGCAAG 
GAAATTTCCT CTTTGGGATT TGCCAGCGCC 
TGTATTATTG TTATTTTATT AATTAGTCAG 
CCTGTCACCC TTCCTGTGCT AAGATTTAAA 
5 AAAATGAAGC AAAAGGAGTA AGATTTTTAA 
OGCACTTTTA TTTGTATTTT TTCAGATTTT 
TGGGTGGCTG ACTGGCTGCG GGAAGCTACT 
ATTGTTTGCC TTGCTCTGGA TGGTGGAAGG 
GGAGGAGCAG GAACATGGCA CTTTCGTGGG 

10 TACAAAACTT TCGGCTCGCG GGTTTCAGAC 
CCTCAACCTG GAGACAGGGG TGCTGTACGT 
CAAACAGAGC CCCTCCTGTG TCCTGCACCT 
GTTCCAGGTG GAGATC6A66 TGCTGGACAT 
AGACCT6A00 GTGGAAATCT CTGAGAGOGC 

15 CGCATTCGAC CCAGACGTGG GCACCAACTC 
CTACTTCTCC CTGGACGTGC AGACCCAGGG 
GGAGAAGCCA CTGGACCGAG AGCAGCAAGC 
CGGAGGAGGT GGGGGAGGAG- TAGGAGAAGG 
CCCCCAGCAG CAGCGCACCG GCACGGCCCT 

20 CAATGTGCCC GCTTTCGACC AACCCGTCTA 
AGGCACTCTC GTGATCCAGC TCAACGCCAC 
CGTGTACTCC TTCAGCAGCC ACATTTCGCC 
GCGCACTGGC AGACTGGAGG TAAGCGGCGA 
AGTGTACGTG CAAGCCAAGG ACCTGGGCCC 

25 AGTGCGAGTA CTGGATGCTA ATGACAACGC 
AGCGGTGAGT GAGGGCGCGG CGCCCGGCAC 
CGACTCAGAG GAGAATGGGC AGGTGCAGTG 
CAAGTCTTCC TTTAAGAATT ACTACACCAT 
GGGGGACTCC TACACCCTGA CTGTAGTGGC 

30 CAGTAAGTCG ATCCAGGTAC AAGTGTCGGA 
GCCGGTCTAC GACGTGTATG TGACTGAAAA 
GAGCGCCACC GACCGGGATG AGGGCGCCAA 
CCAGATCCAG GGCATGAGCG TCTTCACCTA 
GTAOGCCCTG CGCTCCTTCG ACTATGAGCA 

35 CCGGGACGCT GGCAGCCCCC AGGCGCTGGC 
GGATCAAAAT GACAACGCCC CTGCCATCGT 
AGOGCGTGAG GTGCTGCCCC GCTCGGCGGA 
CGTGGACGCG GACGACGGCG AGAACGCCCG 
AATGAACCTC TTTCGCATGG ACTGGCGCAC 

40 GGCCAAGCGC GACCCCCAGC GGCCTTATGA 
GCCGCCCCTT TCCTCCACCG CCACCCTGGT 
CCAGGGCGGG GGCGGGAGCG GAGGCGGAGG 
TGGCGGOGGG GAAACCTCGC TAGACCTCAC 
GTCCTTCATC TTCCTGCTGG CCATGATCGT 

45 GCTCAACATC TATACTTGTC TGGCCAGCGA 
CGGAGGTTCG ACCTGCTGTG GCCGCCAAGC 
AGACATCATG CTGGTGCAGA GCTCCAATGT 
GGAGTCCGGG GGCTTTGGCT CCCACCACCA 
GACCCCTGAG TCCGCCAAGA CCGACCTGAT 

50 TACGGACACT GAGCACAACC CCTGCGGGGC 
TGATATCATC TCCAACGGAA GCATTTTGTC 
CAGCTATCTA GTTGACAGAC CTOGCCGAGT 
AGTAAGCTCT AAGGACAGTG GTCATGGAGA 
CACCAACCGT GCCCAGTCAG CTGGTATGGA 

55 AGCTCTGGGC CACTCAGATC GGTGCTGGAT 
GGCTGCTGAT TATCGCAGCA ATCTGCATGT 
GGTGTTTGAA ACTCCAGAAG CCCAGCCTGG 
AGAGAAGGCC CTTCACAGCA CTCTGGAGAG 
GCGAGCGCCT TACAAACCAC CATATTTGAC 

60 GGACTTACCT GAAGCAGCAT GATTTGCACA 
ACTTCATTAT CTTGGCCATC CAGTTAGTCA 
GTCATCATGG CCAATTATAG GACCTAATTG 
TGTGCAGAAC TGTAGAAACT TTAGAGGCAA 
TGTTTACAGC ACTATCTATC TTTCTCTCTC 

65 ATTCACCACG AGAAGCCAGT CATAAAGATA 
CACTGTTTTA AACTTGACTG TTTTATATTA 
TTCCAACTTT ACAAGAGAAA TTGTGATTAT 
TTGTATTCTG AAGACCCACA AAATATCAAA 
AAGTGTTTAC TGTACTATTT CAAAGCTTCT 

70 TATAATTTTC CTAAAATGTG GTACAACTCA 
CATCATACAA TAAAATAAAA GGTAATTCAG 
TCATTAATAG TTTTCTCCCA ATTTCCATAT 
AGAAAATGAT GCTCTAAGCT ACAAAATTTT 
AAAGATGTAG CTATTGATGT TATCAGACAG 

75 ACAATCTGCA TAAGTCTGAT TCTATTTCTA 
TTATAAAGAA TCGATAAATT CACCTGTATT 



AAGACAAACC GAGGACAGTC TTGAAATATC 480 

AAGACTGTCG GAATAAAGGA CGCTGACTAT 540 

TGGAAAGATT ACAGATGAGG AAAGGGGACG 600 

AAAAAATGAG GCTGGATTGC GGGAAGCTCT 660 

AGACAGAAAG CCACAGGAGC CCCCACGTAG 720 

TTTTTGTTTC GTGGTGGTGG GGGAGGTGAT 780 

TCCTTTCCTT TTGGAGATGA TTGTGCTATT 840 

AGTCTTTTCC CAGCTTCACT ACACGGTACA 900 

GAATATCGCT GAAGATCTGG GTCTGGACAT 960 

GGTGCCCAAC TCAAGGACCC CTTACTTAGA 1020 

GAACGAGAAA ATAGACCGCG AACAAATCTG 1080 

GGAGGTCTTT CTGGAGAACC CCCTGGAGCT 1140 

TAATGACAAC CCCCCCTCTT TCCCGGAGCC 1200 

CACGCCAGGC ACTCGCTTCC CCTTGGAGAG 1260 

CTTGCGCGAC TACGAGATCA CCCCCAACAG 1320 

GGATGGCAAC CGATTCGCTG AGCTGGTGCT 1380 

GGTGCACCGC TACGTGCTGA CCGCGGTGGA 1440 

AGGGGGAGGT GGCGGGGGAG CAGGCCTGCC 1500 

ACTCACCATC CGAGTGCTGG ACTCCAATGA 1560 

CACTGTGTCC CTACCAGAGA ACTCTCCCCC 1620 

CGACCCGGAC GAGGGCCAGA ACGGTGAGGT 1680 

CCGGGCGCGG GAGCTTTTCG GACTCTCGCC 1740 

GTTGGACTAT GAAGAGAGCC CAGTGTACCA 1800 

CAACGCCGTG CCTGCGCACT GCAAGGTGCT 1860 

GCCAGAGATC AGCTTCAGCA CCGTGAAGGA 1920 

TGTGGTGGCC CTTTTCAGCG TGACTGACCG 1980 

CGAGCTACTG GGAGACGTGC CTTTCCGCCT 2040 

CGTTACCGAA GCCCCCCTGG ACCGAGAGGC 2100 

TCGGGACCGG GGCGAGCCTG CGCTCTCCAC 2160 

TGTGAACGAC AACGCGCCGC GTTTCAGCCA 2220 

CAACGTGCCT GGCGCCTACA TCTACGCGGT 2280 

CGCCCAGCTT GCCTACTCTA TCCTCGAGTG 2340 

CGTTTCTATC AACTCTGAGA ACGGCTACTT 2400 

GCTGAAGGAC TTCAGTTTTC AGGTGGAAGC 2460 

TGGTAACGCC ACTGTCAACA TCCTCATAGT 2520 

GGCGCCTCTA CCAGGGCGCA ACGGGACTCC 2580 

GCCGGGTTAC CTGCTCACCC GCGTGGCCGC 2640 

GCTCACTTAC AGCATCGTGC GTGGCAACGA 2700 

CGGGGAGCTG CGCACAGCAC GCCGAGTCCC 2760

GCTGGTGATC GAGGTGCGCG ACCATGGGCA 2820

GGTTCAGCTG GTGGATGGCG CCGTGGAGCC 2880 

GTCAGGAGAG CACCAGCGCC CCAGTCGCTC 2940 

CCTCATCCTC ATCATCGCGT TGGGCTCGGT 3000 

GCTGGCCGTG CGTTGCCAAA AAGAGAAGAA 3060 

TTGCTGCCTC TGCTGCTGCT GCTGCGGTGG 3120 

CCGGGCGCGC AAGAAGAAAC TCAGCAAGTC 3180 

ACCCAGTAAC CCGGCCCAGG TGCCGATAGA 3240 

CAACCAGAAT TACTGCTATC AGGTATGCCT 3300 

GTTTCTTAAG CCCTGCAGCC CTTCGCGGAG 3360 

CATCGTCACC GGTTACACCG ACCAGCAGCC 3420 

CAACGAGACT AAACACCAGC GAGCAGAGCT 3480 

TAACAGTTCT GCATTCCAGG AAGCCGACAT 3540 

CAGTGAACAG GGAGATAGTG ATCATGATGC 3600 

TCTCTTCTCC AATTGCACTG AGGAATGTAA 3660 

GCCTTCTTTT GTCCCTTCTG ATGGACGCCA 3720 

TCCTGGCATG GACTCTGTTC CAGACACTGA 3780 

GGCAGAGCGG TCCTTTTCCA CCTTTGGCAA 3840 

GAAGGAGCTG GATGGACTGC TGACTAATAC 3900 

ACGGAAAAGG ATATGCTAGT CAATTCTACA 3960

AAGTCGACCA ACAAAAGCAT CAACTTTTCA 4020 

TGTGTAACTG AGTATTAGAT TTCGGATGGA 4080 

CTCTCAGCAG GCCTGAGAAA TGAGTTGAAA 4140 

CAGATTTTGC CTCCCCGATC AGTGTGTGCC 4200 

CAAATGTCAC TGAGCCCTTT AGATGTTTAT 4260 

AAGGAAATTT GTGCATTATA AATGCAATAT 4320 

TTTTTGTGTG ATCAAGTGTT CCGCAAGCTA 4380 

GTTCTTTTCA CCTGTGGGTT ATAAAAAATG 4440 

GACATTCTGT AGTTTATACA CCGTGTTGCA 4500 

AAATAAATAT AAAATATATA TATTATATTA 4560 

GTTGGTTTTT AAATGGATGC ATACAGTCCA 4620 

GGTCCCAAAG ACAAACTTAC TAAGAAAAAA 4680 

CTTACTCAAC CGTGTTTTTC CTTGTTTAAA 4740 

GTCAAAAACT CATATTGAAT TTTCAATGCC 4800 

AGCACTGACT ATGTACTATC AAACTATCTA 4860

TGACTTTGAA TTTAGAATCA CTTAAAGCTT 4920 

TGTTGTTAGA AAAAAACTGG GTGTCTGTAC 4980 
ATTTTGTGGT
ACTCACACAG 
TATAATTCTA 
AAAATATGAA 
TTCAGTATTC 
AATTCTTCAT 
TTAATTATTA 



GTAAAATATG 
AATTTTATTT 
TATGAATATA 
AGCTCTAAAT 
CATGTAAAAA 
GAGTCAGCCT 
AATTTAGTAA 



TAATTGAAGA 
TACATAGTTT 
TAGAGATATA 
TTAAAATAAA 
TTTTATAGCT 
TCAAAAGTTA 
GACGCAAAAA 



TTACTATTTT 
TGTGACTTAA 
GAAACATCTG 
TTTAGAGATA 
TAAATGTAGT 
AGCTTGCCTT 
AAAAAAAAAA 



AAGAAGTCAT 
TTACACATGA 
AACTGGTAAA 
GAATCATGGT 
CAGTGTTTGA 
TTACTTTTAT 
AAAA 



CAGTCATATC 
ATATAAAATC 
GAATAACTAT 
ACATTATTGT 
TTAATGAAAA 
GTCAACAATA 



5040 
5100 
5160 
5220 
5280 
5340 






Seq ID No: 157 Protein sequence: 
Protein Accession #: NP_116586.1






i 
I 

MIVLLLFALL 
TPYLDLNLET 
SFPEPDLTVE 
AELVLEKPLD 
LDSNDNVPAF 
FGLSPRTGRL 
STVKEAVSBG 
LDREAGDSYT 
YIYAVSATDR 
FQVEARDAGS 
TRVAAVDADD 
RDHGQPPLSS 
ALGSVSFIFL
KLSKSDIMLV 
SPSRSTDTEH 
QEADIVSSKD 
SDGRQAADYR 
LLTNTRAPYK 



11 
I 

WMVEGVFSQL 
GVLYVNEKID 
ISESATPGTR 
REQQAVHRYV 
DQPVYTVSLP 
EVSGELDYEE 
AAPGTVVALF
LTVVARDRGE
DEGANAQLAY
PQALAGNATV
GENARLTYSI
TATLVVQLVD
LAMIVIAVRC
QSSNVPSNPA
NPCGAIVTGY
SGHGDSEQGD
SNLHVPGMDS 
PPYLTRKRIC 



21 
I 

HYTVQEEQEH 
REQICKQSPS 
FPLESAFDPD 
LTAVDGGGGG 
ENSPPGTLVI 
SPVYQVYVQA 
SVTDRDSEEN 
PALSTSKSIQ 
SILECQIQGM 
NILIVDQNDN 
VRGNEMNLFR 
GAVEPQGGGG 
QKEKKLNIYT 
QVPIEESGGP 
TDQQPDIISN 
SDHDATNRAQ 
VPDTEVFETP 



31 
i 

GTFVGNIAED 
CVLHLEVFLE 
VGTNSLRDYE 
GVGEGGGGGG 
QLNATDPDEG 
KDLGPNAVPA 
GQVQCELLGD 
VQVSDVNDNA 
SVFTYVSINS 
APAIVAPLPG 
MDWRTGELRT 



CLASDCCLCC 
GSHHHNQNYC 
GSILSNKTKH 
SAGMDLFSNC 
EAQPGAERSF 



41 

I 

LGLDITKLSA 
NPLELFQVEI 
ITPNSYFSLD 
GAGLPPQQQR 
QNGEVVYSFS
HCKVLVRVLD
VPFRLKSSFK
PRFSQPVYDV
ENGYLYALRS
RNGTPAREVL
ARRVPAKRDP 
RPSRSGGGET 
CCCGGGGSTC 
YQVCLTPESA 
QRAELSYLVD 
TEECKALGHS 
STFGKEKALH 



51 

I 

RGFQTVPNSR 
EVLDINDNPP 
VQTQGDGNRF 
TGTALLTIRV 
SHISPRAREL
ANDNAPEISF 
NYYTIVTEAP 
YVTENNVPGA 
FDYEQLKDFS 
PRSAEPGYLL 
QRPYELVIEV 
SLDLTLILII 
CGRQARARKK 
KTDLMFLKPC 
RPRRVNSSAF 
DRCWMPSPVP 
STLERKELDG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 



Seq ID NO: 158 DNA sequence 
Nucleic Acid Accession #: NM_022159.1

Coding sequence: 70-1890 (underlined sequences correspond to start and stop codons) 



1 11 21 31 

| | | |

GTGAAATTTA AACTCCAGTC CTGTGGCGAA AATGCTAATT 
TATTATTGTA TGTGTGTACC TGGCTTCAGA TCCAGCAGTA
AATGATGGAA CCGTCTGTAT AGAAAATGTG AATGCAAACT 
ATAGCTGCAA ATATTAATAA AACTTTAACA AAAATCAGAT 

TTGCTACAAG AAGTCTATAG AAATTCTGTG ACAGATCTTT
TATATAGAAA TATTAGCTGA ATCATCTTCA TTACTAGGTT 
GCCAAGGACA CCCTTTCTAA CTCAACTCTT ACTGAATTTG 
GTTCAAAGGG ATACATTTGT AGTTTGGGAC AAGTTATCTG 
CTTACAAAAC TCATGCACAC TGTTGAACAA GCTACTTTAA 

AAGACCACAG AGTTTGATAC AAATTCAACG GATATAGCTC
TCATATAACA TGAAACATAT TCATCCTCAT ATGAATATGG 
TTTCCAAAGA GAAAAGCTGC ATATGATTCA AATGGCAATG 
TATAAGAGTA TTGGTCCTTT GCTTTCATCA TCTGACAACT 
TATGATAATT CTGAAGAGGA GGAAAGAGTC ATATCTTCAG 

TCAAACCCAC CCACATTATA TGAACTTGAA AAAATAACAT
GTCACAGATA GGTATAGGAG TCTATGTGCA TTTTGGAATT 
GGCAGCTGGT CTTCAGAGGG CTGTGAGCTG ACATACTCAA 
CGCTGTAATC ACCTGACACA TTTTGCAATT TTGATGTCCT 
AAAGATTATA ATATTCTTAC AAGGATCACT CAACTAGGAA 

CTTGCCATAT GCATTTTTAC CTTCTGGTTC TTCAGTGAAA
ATTCACAAAA ATCTTTGCTG TAGCCTATTT CTTGCTGAAC 
AATACAAATA CTAATAAGCT CTTCTGTTCA ATCATTGCCG 
TTAGCTGCTT TTGCATGGAT GTGCATTGAA GGCATACATC 
GTCATCTACA ACAAGGGATT TTTGCACAAG AATTTTTATA 

GCCGTGGTAG TTGGATTTTC GGCAGCACTA GGATACAGAT
TGTTGGCTTA GCACCGAAAA CAACTTTATT TGGAGTTTTA 
ATTCTTGTTA ATCTCTTGGC TTTTGGAGTC ATCATATACA 
GGGTTGAAAC CAGAAGTTAG TTGCTTTGAG AACATAAGGT 
GCTCTTCTGT TCCTTCTCGG CACCACCTGG ATCTTTGGGG 

TCAGTGGTTA CAGCTTACCT CTTCACAGTC AGCAATGCTT
TTATTCCTGT GTGTTTTATC TAGAAAGATT CAAGAAGAAT 
GTCCCCTGTT GTTTTGGATG TTTAAGGTAA ACATAGAGAA 
ACAAAAATAA AAATTCCAAG CTGTGGATGA CCAATGTATA 
CCAATTATTA ACTACTAGAC AAAAAGTATT TTAAATCAGT 

AACTGTAGAT AATAAGGTAA AATTATGTAT CATATAGATA
AATAGTTCTG TCAAAAATAG TATTGCAGAT ATTTGGAAAG 



41 

I 

GCACTAACAC 
ACCAAGACAG 
GCCATTTAGA 
CCATAAAAGA 
CACCAACAGA 
ACAAGAACAA 
TAAAAACCGT 
TGAATCATAG 
GGATATCCCA 
TCAAAGTTTT 
ATGGAGACTA 
TTGCAGTTGC 
TCTTATTGAA 
TAATTTCAGT 
TTACATTAAG 
ACTCACCTGA 
ATGAGACCCA 



TAATTATTTC 
TTCAAAGCAC 
TTGTTTTTCT 
GACTGCTACA 
TCTATCTCAT 
TCTTTGGCTA 
ATTATGGCAC 
TAGGACCAGC 
AAGTTTTTCG 
CTTGTGCAAG 
TTCTCCATGT 
TCCAGGGGAT 
ATTACAGATT 
TGGTGGATAA 
AAAATGACTC 
TTTTCTGTTT 
TACTATCPPT 
TAATTGGTTT 



51 
I 

AGAAGGAAGT 

GTTTATCACT

TAATGTCTGT 

ACCTGTGGCT 

TATAATTACA 

CACTATCTCA 

GAATAATTTT 

GAGAACACAT 

GAGCTTCCAA 

CTTCTTTGAT 

CATAAATATA 

ATTTTTATAT 

ACCTCAAAAT 

CTCAATGAGC 

TCATCGAAAG 

TACCATGAAT 

CACCTCATGC 

CATTGGTATT 

ACTGATTTGT 

CAGGACAACA 

TGTTGGGATC 

CTACTTCTTT 

TGTTGTGGGT 

TCTAAGCCCA 

AACCAAAGTA 

ATGCCTAATC 

TCACACTGCA 

AGGAGCCCTC 

TGTGCACGCA 

GTTCATTTTT 

GTTCAAAAAT 

TTACAACTGC 

ATCAAATTAT 

ATGCTATAGG 

TTCTATGTGA 

CTCAGGAGTG 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
ATATCACTGC ACCCAAGGAA AGATTTTCTT TCTAACACGA GAAGTATATG AATGTCCTGA 2220

AGGAAACCAC TGGCTTGATA TTTCTGTGAC TCGTGTTGCC TTTGAAACTA GTCCCCTACC 2280 

ACCTCGGTAA TGAGCTCCAT TACAGAAAGT GGAACATAAG AGAATGAAGG GGCAGAATAT 2340 

CAAACAGTGA AAAGGGAATG ATAAGATGTA TTTTGAATGA ACTGTTTTTT CTGTAGACTA 2400 

GCTGAGAAAT TGTTGACATA AAATAAAGAA TTGAAGAAAC ACATTTTACC ATTTTGTGAA 2460 

TTGTTCTGAA CTTAAATGTC CACTAAAACA ACTTAGACTT CTGTTTGCTA AATCTGTTTC 2520 
TTTTTCTAAT ATTCTAAAA 

Seq ID No: 159 Protein sequence: 
Protein Accession #: NP_071442.1



1 11 21 31 41 51

| | | | | |

MCVPGFRSSS NQDRFITNDG TVCIENVNAN CHLDNVCIAA NINKTLTKIR SIKEPVALLQ 60

EVYRNSVTDL SPTDIITYIE ILAESSSLLG YKNNTISAKD TLSNSTLTEF VKTVNNFVQR 120

DTFVVWDKLS VNHRRTHLTK LMHTVEQATL RISQSPQKTT EFDTNSTDIA LKVFFFDSYN 180

MKHIHPHMNM DGDYINIFPK RKAAYDSNGN VAVAFLYYKS IGPLLSSSDN FLLKPQNYDN 240

SEEEERVISS VISVSMSSNP PTLYELEKIT FTLSHRKVTD RYRSLCAFWN YSPDTMNGSW 300

SSEGCELTYS NETHTSCRCN HLTHFAILMS SGPSIGIKDY NILTRITQLG IIISLICLAI 360

CIFTFWFFSE IQSTRTTIHK NLCCSLFLAE LVFLVGINTN TNKLFCSIIA GLLHYFFLAA 420

FAWMCIEGIH LYLIVVGVIY NKGFLHKNFY IFGYLSPAVV VGFSAALGYR YYGTTKVCWL 480

STENNFIWSF IGPACLIILV NLLAFGVIIY KVFRHTAGLK PEVSCFENIR SCARGALALL 540

FLLGTTWIFG VLHVVHASVV TAYLFTVSNA FQGMFIFLFL CVLSRKIQEE YYRLFKNVPC 600
CFGCLR



Seq ID NO: 160 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 1-216 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TGTCTGCTTA TGCGGTGGCT CGCTGCTCAG AACAGGATGG CAGAGATGAG CACCACCATC 60

AAAAACTCAA GGACCAGTGC TGTGGGTCCA GTCATCTGTT TCATGGAATT CACCAGTCTG 120 

GTATCTTCAA AATCCAGAAG GATGATGGCA GATGGCAGGA AGGAGGAAGA GGGTAATCTG 180 

GAAGAGTTTC CTGACCTACT CTGCTGCTGT GATTAAACAA CCACCAGGAA ATTTTGATGA 240

CACTGTTCTC CTGAGCTCCT CCCTTTCCTC GGGGAAGAAA AGCATTGAAA CTACAAAAAT 300 

AAAGTGTTAT TTGGCTGGAG TGAGGTCTCA TGTCTGCTTA TGCGGTGGCT CGCTGCTCAG 360 

AACAGGGAAC CATTGGAGAT ACTCATTACT CTTTGAAGGC TTACAGTGGA ATGAATTCAA 420 

ATACGACTTA TTTGAGGAAT TGAAGTTGAC TTTATGGAGC TGATAAGAAT CTTCTTGGAG 480 

AAAAAAAGAC TGGTACTTCT GAATTAACCA AAATCACAGT ATTCTGAAGA TGATTCTACA 540 

AAGCCTGCTG TTTCTACAAA GGCTGCTGAT GATTTCTACA AAGCCTGCTG TAGTGTTGCT 600 

GTGGCCTCTG CTTAAAAAAG TAGAAAACAC ATTGATGCAG CATGTTCACC CCAACCTCCC 660 

TGCCTAAAGG CTCAGGGACC ATCTTGGAAG AGGAAGGCGC GTGAGATTGT AAGAGCCGAA 720 

TTAGGGGGAT GGAGTGTGGA GAATAAGGAC ACTTCATCTT GGATGCTCAC CTGCCAAATT 780 

GACTTCTGAT GAAAGCCAGC TCCAGAAATG TGCCTACAGT TACTACTTTC ACCTAAACCC 840 

TGCCCTTAGT CAAATCCTTC TCTTCTTCTA AGCAATCAAC TTCAATTCCT TGTATAACCC 900 

ACAGTATAAA AGGGCTTTTA TACCATTCTA TCCTATTGCA TGTAAGCCTT GGGTCTGGGA 960 

GGTAACAGTG TGGGATTCCA CCATCTCATC TCCCTGCCAC CCAAACATGC CTGCTCTTCT 1020 
TTAAGCAATA TTAAATGTTT GTACTTCA 

Seq ID No: 161 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51

| | | | | |

CLLMRWLAAQ NRMAEMSTTI KNSRTSAVGP VICFMEFTSL VSSKSRRKMA DGRKEEEGNL 60
EEFPDLLCCC D
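
The correspondence between a "Coding sequence" range and the protein listing that follows it (for example, the 1-216 range of Seq ID NO: 160 and the protein of Seq ID No: 161) can be checked mechanically. The following is a minimal sketch only and is not part of the original listing. It assumes the standard genetic code and treats the stated range as 1-based and inclusive; the helper names `clean_listing` and `translate_cds` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated with base order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean_listing(text):
    """Strip spaces, position numbers and punctuation from listing rows.

    Only letters are kept, so pass sequence rows only (not the tick-mark
    line, if it was transcribed with the letter I).
    """
    return re.sub(r"[^A-Za-z]", "", text).upper()


def translate_cds(dna, start, end):
    """Translate dna[start..end] (1-based, inclusive) up to the first stop.

    Codons containing unexpected characters are reported as 'X'.
    """
    cds = dna[start - 1:end]
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino == "*":  # stop codon ends the protein
            break
        protein.append(amino)
    return "".join(protein)


if __name__ == "__main__":
    # First row of Seq ID NO: 160 (positions 1-60).
    row = "TGTCTGCTTA TGCGGTGGCT CGCTGCTCAG AACAGGATGG CAGAGATGAG CACCACCATC 60"
    print(translate_cds(clean_listing(row), 1, 60))  # prints CLLMRWLAAQNRMAEMSTTI
```

The printed residues agree with the first two blocks of Seq ID No: 161. Running the same check over a full listing is one way to spot transcription errors in either the nucleotide or the protein rows.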

Seq ID NO: 162 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 1-159 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | |

GAGACCCTCC AGAGGCAGGG CCCAGGATTG AAGAGGGAAG CCCTGCTCCA CACGTGTTCA 60 

TCAGGAAGGA CCCACAGACT GCTGCTCCTG GAGGCCTCTC GGTTTATGGA TGTGTGTTTG 120 

TTCCATAAAC CCTCAGAGGG TCACCTGGAG ACCCGCTAAA ATGCAGGTTC TTGGGCCACA 180 

TCCTAGACCT TCTGACCGAC CCAGGGAGTG GGGCCCAGGA AGCTGCATTT GACAGATATC 240 

CCCGTGTGAT CATCATGCAC ACAGGAGTGA GAGAACCAGT GTTCTCCCCG GGCAGAAGGG 300 

AAGCTCGTGT GCAGGACACC TCACACCTCC TTTCCCATTC CCCTGCCAGG CTCTCCCTGC 360 

TGACATTGTT TTTGCGGGAG AGCTGTGAAT TCTGAAGATT AGGTTGCTTC TCACCCCAAG 420 
CTCCAGAAGT CCAGGCTGAG CCAAACCAAG CTTCAAGTTG TGCCTGGACT TGGAGAACCA 480

GGAGGTGAGG GGACTGACTA CTTGAAGATC ACATGGAGGA GGAGTCTGAT CCAGGCCCAG 540 

GCACCAAGGA AAGGCCATGC AAGGACACAG GGAGAAGGGC AGCTGTCTGT AAGCCAGAAA 600 

GAGCCTTCAC TAGAAACCAA ATCAGCCAGA ACCTTCATCT TGGACTTTCC AGCCTTCAGA 660 
GATGTGAAAA AATAAATTTC TGTTGATTAA CCTAAAAAA 



Seq ID No: 163 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51

| | | | | |

ETLQRQGPGL KREALLHTCS SGRTHRLLLL EASRFMDVCL FHKPSEGHLE TR



Seq ID NO: 164 DNA sequence 

Nucleic Acid Accession #: NM_020241.1 

Coding sequence: 4-1557 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

| | | | | |

GCCATGCAGA CCCCGCGAGC GTCCCCTCCC CGCCCGGCCC TCCTGCTTCT GCTGCTGCTA 60
CTGGGGGGCG CCCACGGCCT CTTTCCTGAG GAGCCGCCGC CGCTTAGCGT GGCCCCCAGG 120
GACTACCTGA ACCACTATCC CGTGTTTGTG GGCAGCGGGC CCGGACGCCT GACCCCCGCA 180
GAAGGTGCTG ACGACCTCAA CATCCAGCGA GTCCTGCGGG TCAACAGGAC GCTGTTCATT 240
GGGGACAGGG ACAACCTCTA CCGCGTAGAG TTGGAGCCCC CCACGTCCAC GGAGCTGCGG 300
TACCAGAGGA AGCTGACCTG GAGATCTAAC CCCAGCGACA TAAACGTGTG TCGGATGAAG 360
GGCAAACAGG AGGGCGAGTG TCGAAACTTC GTAAAGGTGC TGCTCCTTCG GGACGAGTCC 420
ACGCTCTTTG TGTGCGGTTC CAACGCCTTC AACCCGGTGT GCGCCAACTA CAGCATAGAC 480
ACCCTGCAGC CCGTCGGAGA CAACATCAGC GGTATGGCCC GCTGCCCGTA CGACCCCAAG 540
CACGCCAATG TTGCCCTCTT CTCTGACGGG ATGCTCTTCA CAGCTACTGT TACCGACTTC 600
CTAGCCATTG ATGCTGTCAT CTACCGCAGC CTCGGGGACA GGCCCACCCT GCGCACCGTG 660
AAACATGACT CCAAGTGGTT CAAAGAGCCT TACTTTGTCC ATGCGGTGGA GTGGGGCAGC 720
CATGTCTACT TCTTCTTCCG GGAGATTGCG ATGGAGTTTA ACTACCTGGA GAAGGTGGTG 780
GTGTCCCGCG TGGCCCGAGT GTGCAAGAAC GACGTGGGAG GCTCCCCCCG CGTGCTGGAG 840
AAGCAGTGGA CGTCCTTCCT GAAGGCGCGG CTCAACTGCT CTGTACCCGG AGACTCCCAT 900
TTCTACTTCA ACGTGCTGCA GGCTGTCACG GGCGTGGTCA GCCTCGGGGG CCGGCCCGTG 960
GTCCTGGCCG TTTTTTCCAC GCCCAGCAAC AGCATCCCTG GCTCGGCTGT CTGCGCCTTT 1020
GACCTGACAC AGGTGGCAGC TGTGTTTGAA GGCCGCTTCC GAGAGCAGAA GTCCCCCGAG 1080
TCCATCTGGA CGCCGGTGCC GGAGGATCAG GTGCCTCGAC CCCGGCCCGG GTGCTGCGCA 1140
GCCCCCGGGA TGCAGTACAA TGCCTCCAGC GCCTTGCCGG ATGACATCCT CAACTTTGTC 1200
AAGACCCACC CTCTGATGGA CGAAGCGGTG CCCTCGCTGG GCCATGCGCC CTGGATCCTG 1260
CGGACCCTGA TGAGGCACCA GCTGACTCGA GTGGCTGTGG ACGTGGGAGC CGGCCCCTGG 1320
GGCAACCAGA CCGTTGTCTT CCTGGGTTCT GAGGCGGGGA CGGTCCTCAA GTTCCTCGTC 1380
CGGCCCAATG CCAGCACCTC AGGGACGTCT GGGCGTGTGT GTCAAGTGGG CCACGCGTGC 1440
AGGGTGTGTG TCCACGAGCG ACGATCGTGG TGGCCCCAGC GGCCTGGGCG TTGGCTGAGC 1500
CGACGCTGGG GCTTCCAGAA GGCCCGSGGG CCTCCGAGGT GCCGGTTAGG AGTTTGAACC 1560
CCCCCCACTC TGCAGAGGGA AGCGGGGACA ATGCCGGGGT TTCAGGCAGG AGACACGAGG 1620
AGGGCCTGCC CGGAAGTCAC ATCGGCAGCA GCTGTCTAAA GGGCTTGGGG GCCTGGGGGG 1680
CGGCGAAGGT GGGTGGGGCC CCTCTGTAAA TACGGCCCCA GGGTGGTGAG AGAGTCCCAT 1740
GCCACCCGTC CCCTTGTGAC CTCCCCCCTC TGACCTCCAG CTGACCATGC ATGCCACGTG 1800
G



Seq ID No: 165 Protein sequence: 
Protein Accession #: NP_064626.1



1 11 21 31 41 51

| | | | | |

MQTPRASPPR PALLLLLLLL GGAHGLFPEE PPPLSVAPRD YLNHYPVFVG SGPGRLTPAE 60
GADDLNIQRV LRVNRTLFIG DRDNLYRVEL EPPTSTELRY QRKLTWRSNP SDINVCRMKG 120
KQEGECRNFV KVLLLRDEST LFVCGSNAFN PVCANYSIDT LQPVGDNISG MARCPYDPKH 180
ANVALFSDGM LFTATVTDFL AIDAVIYRSL GDRPTLRTVK HDSKWFKEPY FVHAVEWGSH 240
VYFFFREIAM EFNYLEKVVV SRVARVCKND VGGSPRVLEK QWTSFLKARL NCSVPGDSHF 300
YFNVLQAVTG VVSLGGRPVV LAVFSTPSNS IPGSAVCAFD LTQVAAVFEG RFREQKSPES 360
IWTPVPEDQV PRPRPGCCAA PGMQYNASSA LPDDILNFVK THPLMDEAVP SLGHAPWILR 420
TLMRHQLTRV AVDVGAGPWG NQTVVFLGSE AGTVLKFLVR PNASTSGTSG RVCQVGHACR 480
VCVHERRSWW PQRPGRWLSR RWGFQKARGP PRCRLGV



Seq ID NO: 166 DNA sequence

Nucleic Acid Accession #: NM_032108.1 

Coding sequence: 39-2705 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

| | | | | |

TCCGAGGCGT CACCTCCTCC TGTCGCCTGG CCCTCGCCAT GCAGACCCCG CGAGCGTCCC 60
CTCCCCGCCC GGCCCTGCTG CTTCTGCTGC TGCTACTGGG GGGCGCCCAC GGCCTCTTTC 120






CTGAGGACCC GCCGCCGCTT AGCGTGGCCC CCAGGGACTA CCTGAACCAC TATCCCGTGT 180 

TTGTGGGCAG CGGGCCCGGA CGCCTGACCC CCGCAGAAGG TGCTGACGAC CTCAACATCC 240

AGCGAGTCCT GCGGGTCAAC AGGACGCTGT TCATTGGGGA CAGGGACAAC CTCTACCGCG 300 

TAGAGCTGGA GCCCCCCACG TCCACGGAGC TGCGGTACCA GAGGAAGCTG ACCTGGAGAT 360 

CTAACCCCAG CGACATAAAC GTGTGTCGGA TGAAGGGCAA ACAGGAGGGC GAGTGTCGAA 420 

ACTTCGTAAA GGTGCTGCTC CTTCGGGACG AGTCCACGCT CTTTGTGTGC GGTTCCAACG 480 

CCTTCAACCC GGTGTGCGCC AACTACAGCA TAGACACCCT GCAGCCCGTC GGAGACAACA 540 

TCAGCGGTAT GGCCCGCTGC CCGTACGACC CCAAGCACGC CAATGTTGCC CTCTTCTCTG 600 

ACGGGATGCT CTTCACAGCT ACTGTTACCG ACTTCCTAGC CATTGATGCT GTCATCTACC 660 

GCAGCCTCGG GGACAGGCCC ACCCTGCGCA CCGTGAAACA TGACTCCAAG TGGTTCAAAG 720 

AGCCTTACTT TGTCCATGCG GTGGAGTGGG GCAGCCATGT CTACTTCTTC TTCCGGGAGA 780 

TTGCGATGGA GTTTAACTAC CTGGAGAAGG TGGTGGTGTC CCGCGTGGCC CGAGTGTGCA 840 

AGAACGACGT GGGAGGCTCC CCCCGCGTGC TGGAGAAGCA GTGGACGTCC TTCCTGAAGG 900 

CGCGGCTCAA CTGCTCTGTA CCCGGAGACT CCCATTTCTA CTTCAACGTG CTGCAGGCTG 960 

TCACGGGCGT GGTCAGCCTC GGGGGCCGGC CCGTGGTCCT GGCCGTTTTT TCCACGCCCA 1020 

GCAACAGCAT CCCTGGCTCG GCTGTCTGCG CCTTTGACCT GACACAGGTG GCAGCTGTGT 1080 

TTGAAGGCCG CTTCCGAGAG CAGAAGTCCC CCGAGTCCAT CTGGACGCCG GTGCCGGAGG 1140 

ATCAGGTGCC TCGACCCCGG CCCGGGTGCT GCGCAGCCCC CGGGATGCAG TACAATGCCT 1200 

CCAGCGCCTT GCCGGATGAC ATCCTCAACT TTGTCAAGAC CCACCCTCTG ATGGACGAGG 1260 

CGGTGCCCTC GCTGGGCCAT GCGCCCTGGA TCCTGCGGAC CCTGATGAGG CACCAGCTGA 1320 

CTCGAGTGGC TGTGGACGTG GGAGCCGGCC CCTGGGGCAA CCAGACCGTT GTCTTCCTGG 1380 

GTTCTGAGGC GGGGACGGTC CTCAAGTTCC TCGTCCGGCC CAATGCCAGC ACCTCAGGGA 1440 

CGTCTGGGCT CAGTGTCTTC CTGGAGGAGT TTGAGACCTA CCGGCCGGAC AGGTGTGGAC 1500 

GGCCCGGCGG TGGCGAGACA GGGCAGCGGC TGCTGAGCTT GGAGCTGGAC GCAGCTTCGG 1560 

GGGGCCTGCT GGCTGCCTTC CCCCGCTGCG TGGTCCGAGT GCCTGTGGCT CGCTGCCAGC 1620 

AGTACTCGGG GTGTATGAAG AACTGTATCG GCAGTCAGGA CCCCTACTGC GGGTGGGCCC 1680 

CCGACGGCTC CTGCATCTTC CTCAGCCCGG GCACCAGAGC CGCCTTTGAG CAGGACGTGT 1740 

CCGGGGCCAG CACCTCAGGC TTAGGGGACT GCACAGGACT CCTGCGGGCC AGCCTCTCCG 1800 

AGGACCGCGC GGGGCTGGTG TCGGTGAACC TGCTGGTAAC GTCGTCGGTG GCGGCCTTCG 1860 

TGGTGGGAGC CGTGGTGTCC GGCTTCAGCG TGGGCTGGTT CGTGGGCCTC CGTGAGCGGC 1920 

GGGAGCTGGC CCGGCGCAAG GACAAGGAGG CCATCCTGGC GCACGGGGCG GGCGAGGCGG 1980 

TGCTGAGCGT CAGCCGCCTG GGCGAGCGCA GGGCGCAGGG TCCCGGGGGC CGGGGCGGAG 2040

GCGGTGGCGG TGGCGCCGGG GTTCCCCCGG AGGCCCTGCT GGCGCCCCTG ATGCAGAACG 2100 

GCTGGGCCAA GGCCACGCTG CTGCAGGGCG GGCCCCACGA CCTGGACTCG GGGCTGCTGC 2160 

CCACGCCCGA GCAGACGCCG CTGCCGCAGA AGCGCCTGCC CACTCCGCAC CCGCACCCCC 2220 

ACGCCCTGGG CCCCCGCGCC TGGGACCACG GCCACCCCCT GCTCCCGGCC TCCGCTTCAT 2280 

CCTCCCTCCT GCTGCTGGCG CCCGCCCGGG CCCCCGAGCA GCCCCCCGCG CCTGGGGAGC 2340

CGACCCCCGA CGGCCGCCTC TATGCTGCCC GGCCCGGCCG CGCCTCCCAC GGCGACTTCC 2400 

CGCTCACCCC CCACGCCAGC CCGGACCGCC GGCGGGTGGT GTCCGCGCCC ACGGGCCCCT 2460 

TGGACCCAGC CTCAGCCGCC GATGGCCTCC CGCGGCCCTG GAGCCCGCCC CCGACGGGCA 2520 

GCCTGAGGAG GCCACTGGGC CCCCACGCCC CTCCGGCCGC CACCCTGCGC CGCACCCACA 2580 

CGTTCAACAG CGGCGAGGCC CGGCCTGGGG ACCGCCACCG CGGCTGCCAC GCCCGGCCGG 2640 

GCACAGACTT GGCCCACCTC CTCCCCTATG GGGGGGCGGA CAGGACTGCG CCCCCCGTGC 2700 

CCTAGGCCGG GGGCCCCCCG ATGCCTTGGC AGTGCCAGCC ACGGGAACCA GGAGCGAGAG 2760 

ACGGTGCCAG AACGCCGGGG CCCGGGGCAA CTCCGAGTGG GTGCTCAAGT CCCCCCCGCG 2820 

ACCCACCCGC GGAGTGGGGG GCCCCCTCCG CCACAAGGAA GCACAACCAG CTCX3CCCTCC 2880 

CCCTACCCGG GGCCGCAGGA CGCTGAGACG GTTTGGGGGT GGGTGGGCGG GAGGACTTTG 2940 

CTATGGATTT GAGGTTGACC TTATGCGCGT AGGTTTTGGT TTTTTTTGCA GTTTTGGTTT 3000 

CTTTTGCGGT TTTCTAACCA ATTGCACAAC TCCGTTCTCG GGGTGGCGGC AGGCAGGGGA 3060

GGCTTGGACG CCGGTGGGGA ATGGGGGGCC ACAGCTGCAG ACCTAAGCCC TCCCCCACCC 3120 

CTGGAAAGGT CCCTCCCCAA CCCAGGCCCC TGGCGTGTGT GGGTGTGCGT GCGTGTGCGT 3180 

GCCGTGTTCG TGTGCAAGGG GCCGGGGAGG TGGGCGTGTG TGTGCGTGCC AGCGAAGGCT 3240 

GCTGTGGGCG TGTGTGTCAA GTGGGCCACG CGTGCAGGGT GTGTGTCCAC GAGCGACGAT 3300 

CGTGGTGGCC CCAGCGGCCT GGGCGTTGGC TGAGCCGACG CTGGGGCTTC CAGAAGGCCC 3360 

GGGGGTCTCC GAGGTGCCGG TTAGGAGTTT GAACCCCCCC CACTCTGCAG AGGGAAGCGG 3420 

GGACAATGCC GGGGTTTCAG GCAGGAGACA CGAGGAGGGC CTGCCCGGAA GTCACATCGG 3480
CAGCAGCTGT CTAAAGGGCT TGGGGGCCTG GGGGGCGGCG AAAG 



Seq ID No: 167 Protein sequence: 
Protein Accession #: NP_115484.1 

1 11 21 31 41 51

| | | | | |

MQTPRASPPR PALLLLLLLL GGAHGLFPED PPPLSVAPRD YLNHYPVFVG SGPGRLTPAE 60

GADDLNIQRV LRVNRTLFIG DRDNLYRVEL EPPTSTELRY QRKLTWRSNP SDINVCRMKG 120

KQEGECRNFV KVLLLRDEST LFVCGSNAFN PVCANYSIDT LQPVGDNISG MARCPYDPKH 180

ANVALFSDGM LFTATVTDFL AIDAVIYRSL GDRPTLRTVK HDSKWFKEPY FVHAVEWGSH 240

VYFFFREIAM EFNYLEKVVV SRVARVCKND VGGSPRVLEK QWTSFLKARL NCSVPGDSHF 300

YFNVLQAVTG VVSLGGRPVV LAVFSTPSNS IPGSAVCAFD LTQVAAVFEG RFREQKSPES 360

IWTPVPEDQV PRPRPGCCAA PGMQYNASSA LPDDILNFVK THPLMDEAVP SLGHAPWILR 420

TLMRHQLTRV AVDVGAGPWG NQTVVFLGSE AGTVLKFLVR PNASTSGTSG LSVFLEEFET 480

YRPDRCGRPG GGETGQRLLS LELDAASGGL LAAFPRCVVR VPVARCQQYS GCMKNCIGSQ 540

DPYCGWAPDG SCIFLSPGTR AAFEQDVSGA STSGLGDCTG LLRASLSEDR AGLVSVNLLV 600

TSSVAAFVVG AVVSGFSVGW FVGLRERREL ARRKDKEAIL AHGAGEAVLS VSRLGERRAQ 660

GPGGRGGGGG GGAGVPPEAL LAPLMQNGWA KATLLQGGPH DLDSGLLPTP EQTPLPQKRL 720






PTPHPHPHAL GPRAWDHGHP LLPASASSSL LLLAPARAPE QPPAPGEPTP DGRLYAARPG 780
RASHGDFPLT PHASPDRRRV VSAPTGPLDP ASAADGLPRP WSPPPTGSLR RPLGPHAPPA 840 
ATLRRTHTFN SGEARPGDRH RGCHARPGTD LAHLLPYGGA DRTAPPVP 



Seq ID NO: 168 DNA sequence 
Nucleic Acid Accession #: AW205664

Coding sequence: 1-135 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

| | | | | |

CGGCACGAGG AGAACAGGGG CCTCTGCCTC AGTTTGCCCG GGAGCCAGCC AGGGCCCATC 60
CTAATTTGGA GCACAGTCTT CCCGGTGCCT AGACATGCCA AGGCCCCTCC CACGTGGTAC 120
ACCCTCTCCG TTTAGTACCT GACCACCTGT TTCAAAACGC AGGTGTTTCT GGTTTAGAAA 180
CTTGGAAGGC GGAATGTGTT TTCGTGTCTT CTAGGAAGGG TCTGCTGAGG ACCAGACCAC 240
GTAAGCCTGA GTGGATCCTG ACTCAGCTGC AGCCCTTACC TGCCTCGTGC TGATGATCTA 300
TGCATGGCGT TATGTAGATC ACGTGCGGCA GAGACAGCCA CTGTCCTGTG TGCGGGTTTT 360
TAAAACAGCT GCCCTGGATG AAACGGAATA AACCAGTGAT GCTAAAAAAA AAAAAAAAAA



Seq ID No: 169 Protein sequence: 
Protein Accession #: AW205664 



1 11 21 31 41 51

| | | | | |

RHEENRGLCL SLPGSQPGPI LIWSTVFPVP RHAKAPPTWY TLSV

Seq ID NO: 170 DNA sequence 
Nucleic Acid Accession #: AB033100 

Coding sequence: 32-2623 (underlined sequences correspond to start and stop codons) 



1 
I 

AGGTCTGGGG 
GACGGTCTCG 
GCACTCCGTC 
CATCATCCCC 
GATCCATGAT 
TGAGCACTAC 
GGATGTCACT 
CCGGCAGGTG 
CAGGCGGGTC 
GCGGGAGGAA 
AGACAAGCAG 
CCTGGAGCTG 
CCATGTGTAC 
TGAGGACGAC 
CTACAGGTAC 
CGCCTTTGTC 
GCCTCCCCCA 
GGTCCTGGGC 
CCCCACGCAG 
CATGGTGCCC 
CGAGTTGCAT 
ACCGGAGAGC 
GAGCCTGGAG 
GCTGGCCTTT 
GCCCGTGACG 
CCTACGGGAG 
GGCCAACTTC 
GGCCCTGGGG 
CTGGGTGAGC
GTGGCCTGGG 
CCATCTAAGC 
CCTTACCATG 
CCGCATCCCC 
GGCCCTGCGG 
CGGCCAGGGC 
AGGCTTCCCC 
GGGTGAATTT 
GAAGGAGGTG 
CCTGCGGGAG
AATGCGGAGG 



11 
I 

TCCTGAGGCT 
GCAGGCACCC 
AGCATCCACT 
AACAAGGTGG 
GAGCTGCTCA 
CTGGTGCAAG 
GAGAAGATGG 
CAGGGTGGCC
CTCCAGAAAC 
ACTGTGCTTT
AACCTTCATG 
GCCATCCGGA 
CATAACACCG 
TTGCATGTGA 
CACCGCCTGC 
AGTGTTCTCC 
GCCCTCGTCT 
ACCCTCATCC 
GCCAAGCCCC 
CAGGGAAGGA 
GACCTGAAAG 
CCAGCCCAGG 
CGATACTTCT
GCCCTCAGTT 
CTGAGCTCAG 
GACGATCTGG 
CGGCGGGTGC 
AGCATCCTGG 
CTTCGGGAGG 
CCCCCTGTGG 
GAGCCTCCCC 
CAGGAGGTCT 
ATGCCGGACT 
GCCGCCCTCT 
CGTACCACAA 
GAGGTGGGTG 
CAGGTAGTAA 
GACGCAGCGC 
ATCATCATCT 
CTGCAGCTGC 



21 
I 

GCTGGCAGAC 
CATTTGAGGG 
CCTTCCAGAG 

AGGCTCATTA 
GAGCTCAGGC 
ATGTGCTGGG 
TCACTGTGTT 
TCCAGAAGGA 
TCCTGCGTGC 
AGAACCTCCA 
AAGAGATCCA 
AGGACCTGTG 
CGGAGGAGGT 
CCCTGCCCGA 
GGGAGACCCC 
TCAGCTGCCA 
TGCTTCACCG 
TGCCTATGGA 
GGATGGTGGA 
AAGTGGTCTT 
GAAGCGGCAG 
ACCTGATCCT 
TCAGCCGCTG 
CAGGCCCTGT 
TCTCCCCGGA 
CCCGCATGCC 
CCTACCTGAC 
AGGCCGTGTT 
CTCCTGACCA 
CAGGCAAGGA 
TCAGCCAGCA 
TCTGTGCCCC 
CCAAGGACCC 
CTGCGATGGT 
AGGAGGAGCT 
TGAAGGTGGT 
TGGACACTGT 
GCACCTACCG 
GGAGCCTGCA 



31 
I 

TATGGGTACA
CCTACAGGGC 
CACTAGCTTG 
GATCACGTAC 
CACGTTGGGC 
CTTACCCCAG 
CACCGTGGGA 
CGGCATGGGA 
CGGACATAGG 
AGATGAGGAC 
GGGCCTTGGA 
CGACTTTGCC 
GGGGGAGCCC 
GTACAAGCGG 
GCAAGGGAGT 
CAGCCTGCTG 
GATGGGCGTG 
CAGTGGGACC 
GCAGTTCCAG 
AGAGGTGGAC 
GGAAAACCAG 
CCGACACAGC 
GTTTAACTAC 
GCTGTGTGCC 
GGCTCCGAGG 
CGCGCTCAGC
CATCTACGGC 
GGACGCCAAG 
GGAGTGTGAC 
GCTGGAGACC 
GGGCCCCCTG 
CCGCAGGGCC 
CCGAGAGGAG 
AGGCACTGGC 
GGTGGCTGTC
CGTGAGTGTG 
GCAGCTGCTA 
CAGCGAGACC 
CCAGGCGAAG 
GTACTTGGAG 



41 
I 

ACGGCCAGCA 
AGTGGCACGA 
CATAACAGCA 
AACTGCAAGG 
CGGCTCTCGG 
GGCCGCTACT 
AGCTGTGGGG 
CAGCCCAGCC 
GAGTGTGTCA 
TTTGTGTCCT 
CCCGGGGTCC 
CAGCTGAGCG 
CATGCTGTGG 
CTCTTCTTCC 
CCCCTGGAGG 
CAGCTCCGTG 
GGCAGGACCA 
ACCTCCCAGC 
GTGATCCAGA 
AGAGCCATCA 
AAGAAGTTAG 
GTCTGGCAGA 
TACCTTCATG 
CACCCTGAGC 
GACCTCATCG 
ACTGTCAGAG 
ACGGCCCAGC 
AGGAGGCTGC 
GGGCACACCT 
CTGGAGGCCC 
ACCTACAGGT 
TGTCCTGGCC 
GACTTTGACC 
TTCGTGTTCA 
CTGGCCTTCT 
CCTGATGCCA 
CCCGATGGGC 
ATGACGCCCA 
GCAGCGAAAG 
CGCTATGTCT 



51 
I 

CAGCCCAGCA 
TGGACAGTCG 
AGGCCAAGTC 
AGGAGTTCCA 
ACAACACCCC 
TCCTGGTGCG 
CCCCCAACTT 
TCTTAGGGTT 
TCTTCTGTGT 
ACACACCTCG 
GGGTGGAGAG 
AGAACACATA 
CCATCCATCG 
TGCAGCCCAC 
CCCAGTTGGA 
ATGCCCACGG 
ACCTGGGCAT 
CAGAGGCTGC 
GCTTTCTCCG 
CTGCCTGTGC 
AAGGTATCCG 
GGGCGCTGTG 
AGCAGTACCC 
TGTACCGCCT 
CCAGGGGCTC 
AGATGGATGT 
CCAGCGCCAA 
GGAAGGTTGT 
ACAGCCTGOG 
AGCTGAAGGC 
TCCAGACCTG 
TCACCTACCA 
AGCTGCTGGA 
GCTGCCTCAG 
GGCACATCCA 
AGTTCACTAA 
ACCGTGTGAA 
TGCACTACCA 
AGGCGCAGGA 
GCCTGATTCT 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 






CTTCAACGCG TACCTCCACC TGGAGAAGGC 
GATGCAGGAG GTGGCATCGA AGGCTGGCAT 
CGAGCTGGAG AGCGGGGAGG ACCAGCCCTT 
GAGCTGCAGC CTCGAGCCCT CTGCCCCCGA
CCCCCCACCC ACAGGGCCCC ACGCAGGCCT 
GGCCCTGAGG GGTGCTGGCC TTGAAATGAT 
TTGGGAGCCT TTTTAGAAAG AACTTTTTAT 
CAAACCACCA AGGTGTGTGG CTGACCTCCA 
GTGCACACTG CTGTGTGTAC CTTGCAGACA 
CCCCCAGTTG CCAAACACTG TGGATCTCTC 
TGGCAGCCCC TGGCACAGAG CAGACCCGGC 
CTGCTCTGCC ATTGCCGCTC CCCTTCTTGC 
GCCTGAGGTG GGTGGAGGGG ACAGTGTTCT 
ACCCAGTTTT CTGGACTCTC ATGCCCCCAT 
CCTACCCAGC CTGGTGGGGC TGGCAGGATG 
AGGGAGCCCC TCTCATGGGG AGGAAAGAGC 
AGGCCTGCTC CACTTGTCTG GGAACCTGGG 
GCTGCAGGTC CCCCGGCATC TCTCTCTGTC 
CCTGCTGCAG CAGGAGCCCC AAGGAGTGCT 
TGGACAGTGA GGTGTGCAAG GGTGCACTGA 
GGCCATCCTT GCTGAGCATC TTTGAGCCTG 
CTGCTGAGTT AGAGGCTGCT GGGATCCACT 
CAGGTGGCAG AGAAGTGCCA TGTTTGCGTT 
GTGCTTGCTG AAACCCAGGA GCTGAACAGT 
GGGACCAGGA AAGCCTGTCT TTGGTTAGGC
GATGTGTCAT TGGTCATGAT ATTTGAAAAG 
CAGTATTGGA AAATATTTGA CCCCCTTGGC 
GTTCACTACC TTTTCAGGTT TATTGTTTTT 
CTTTGCAGAC AAGGTCTAGA TGCGGAGTCA 
GTGTTCTCAT GGTTGGCTCT GACTTTCAGC 
CTCTCTGCCT CAGTTTCCCC ATCTGTAAAA 
CGGGGGTGTT GTGAGGATTC ATTTGTGATT
GCATTAAAAA CAGCTAAATG TG 



CGACTCCTGG CAGAGGCCCT TCAGCACCTG 2460 

CTACGAGATC CTTAACGAGC TGGGCTTCCC 2520 

CTCCAGGCTG CGCTACCGGT GGCAGGAGCA 2580

GGACTTGCTG TAGGGGGCCT TACTCCCTGT 2640 

GGGGTGTCTG AGGTGCTCTT GGCTGGGAGC 2700 

TCCCCCACTT CCTGGAGAGA CTGAGCGGAG 2760 

AGGACAGGGA GACAGCACAG CCATCCCTTG 2820 

GGGAGGAGCA CTCACTGGAG TGCTCACAAG 2880 

GGCCGGCGTT CAGCCTCCAA GGGGCTCACT 2940 

TGTCCTCTTC TCCCCTCTCT CAGATTGGCC 3000 

CACTGGTAGC TCCCCACTTC CTTACTCCTG 3060 

TGCCCAAGCA CTGCCCTCGG GCGTCTGGCA 3120 

GGATAGATCT ATTATGTGAA AGGCAGCTTC 3180 

CTCCGACCTG GGAGACTTCA GGAATGACAA 3240 

GTGGAGGTTT CTCAAGGAGC TGGAGACTTC 3300 

TTCCAGGGGG CGAACGCAGC ACAGAGGAAG 3360 

CAGGAGGCAC AGAGGAAGCC AAGGCCTGGA 3420 

CCGGCAGCCC AGGATGGCCT GGTGCCCCCA 3480 

AGCTGAGGGT GGTTGCTGGG GTGGTCCTCA 3540 

GGGTGGTGGG AGGGGATCAC CTGGGTTCCA 3600 

CCTTCCGGTG GGAGCAGAAA AGGCCAGACC 3660 

GTTTCCACAC AGCGGGAAGG CTGCTGGGAA 3720 

GAGCCTTGCA GCTCTTCCAG CTGGGGACTG 3780 

GAGGAGGCTG TCCACCTTGC TTGGCTCACT 3840 

TCGTGTACTT CTGCAGGAAA AAAAAAAAAG 3900 

GGGAGGAGGC CGAAGTTGTT CCCATTTATC 3960 

TGAATTCTTT TGCAGAACTA CTGTGTGTCT 4020 

ATTTTTGCAT GAATTAAGAC GTTTTAATTT 4080 

GAGATGGGAC TGAATGGGGA GGGATCCTTT 4140 

TGTGTTGGGA CCACTGGCTG ATCACATCAC 4200 

TGGGAGAATA ATACTTGCCT ACCTACCTCA 4260 

TTTTTTTTTT TTTTTGTACA GAGCTTTTAA 4320 



Seq ID No: 171 Protein sequence: 
Protein Accession #: BAA86588.1



1 11 21 31 41 51

| | | | | |

MGTTASTAQQ TVSAGTPFEG LQGSGTMDSR HSVSIHSFQS TSLHNSKAKS IIPNKVAPVV 60

ITYNCKEEFQ IHDELLKAHY TLGRLSDNTP EHYLVQGAQA LPQGRYFLVR DVTEKMDVLG 120

TVGSCGAPNF RQVQGGLTVF GMGQPSLLGF RRVLQKLQKD GHRECVIFCV REEVLFLRAD 180

EDFVSYTPRD KQNLHENLQG LGPGVRVESL ELAIRKEIHD FAQLSENTYH VYHNTEDLWG 240

EPHAVAIHGE DDLHVTEEVY KRPLFLQPTY RYHRLPLPEQ GSPLEAQLDA FVSVLRETPS 300

LLQLRDAHGP PPALVFSCQM GVGRTNLGMV LGTLILLHRS GTTSQPEAAP TQAKPLPMEQ 360

FQVIQSFLRM VPQGRRMVEE VDRAITACAE LHDLKEVVLE NQKKLEGIRP ESPAQGSGSR 420

HSVWQRALWS LERYFYLILF NYYLHEQYPL AFALSFSRWL CAHPELYRLP VTLSSAGPVA 480

PRDLIARGSL REDDLVSPDA LSTVREMDVA NFRRVPRMPI YGTAQPSAKA LGSILAYLTD 540

AKRRLRKVVW VSLREEAVLE CDGHTYSLRW PGPPVAPDQL ETLEAQLKAH LSEPPPGKEG 600

PLTYRFQTCL TMQEVFSQHR RACPGLTYHR IPMPDFCAPR EEDFDQLLEA LRAALSKDPG 660

TGFVFSCLSG QGRTTTAMVV AVLAFWHIQG FPEVGEEELV SVPDAKFTKG EFQVVMKVVQ 720

LLPDGHRVKK EVDAALDTVS ETMTPMHYHL REIIICTYRQ AKAAKEAQEM RRLQLRSLQY 780

LERYVCLILF NAYLHLEKAD SWQRPFSTWM QEVASKAGIY EILNELGFPE LESGEDQPFS 840
RLRYRWQEQS CSLEPSAPED LL

Seq ID NO: 172 DNA sequence 

Nucleic Acid Accession #: AK021806.1 

Coding sequence: 1-645 (underlined sequences correspond to start and stop codons) 



1 11 21 



ACTGTGCTTT 




AACCTTCATG AGAACCTCCA GGGCCTTGGA 
GCCATCCGGA AAGAGATCCA CGACTTTGCC
CATAACACCG AGGACCTGTG GGGGGAGCCC 
TTGCATGTGA CGGAGGAGGT GTACAAGCGG 
CACCGCCTGC CCCTGCCCGA GCAAGGGAGT 
AGTGTTCTCC GGGAGACCCC CAGCCTGCTG 
GCCCTCGTCT TCAGCTGCCA GATGGGCGTG 
ACCCTCATCC TGCTTCACCG CAGTGGGACC
GCCAAGCCCC TGCCTATGGA GCAGTTCCAG 
CAGGGAAGGA GGATGGTGGA AGAGGTGGAT 
AGTTTTCTGG ACTCTCATGC CCCCATCTCC 
CCCAGCCTGG TGGGGCTGGC AGGATGGTGG 
AGCCCCTCTC ATGGGGAGGA AAGAGCTTCC 



31 41 51 

I I I 

TTTGTGTCCT ACACACCTCG AGACAAGCAG 60 

CCCGGGGTCC GGGTGGAGAG CCTGGAGCTG 120 

CAGCTGAGCG AGAACACATA CCATGTGTAC 180 

CATGCTGTGG CCATCCATGG TGAGGACGAC 240 

CCCCTCTTCC TGCAGCCCAC CTACAGGTAC 300 

CCCCTGGAGG CCCAGTTGGA CGCCTTTGTC 360 

CAGCTCCGTG ATGCCCACGG GCCTCCCCCA 420 

GGCAGGACCA ACCTGGGCAT GGTCCTGGGC 480 

ACCTCCCAGC CAGAGGCTGC CCCCACGCAG 540

GTGATCCAGA GCTTTCTCCG CATGGTGCCC 600 

AGATCTATTA TGTGAAAGGC AGCTTCACCC 660 

GACCTGGGAG ACTTCAGGAA TGACAACCTA 720 

AGGTTTCTCA AGGAGCTGGA GACTTCAGGG 780 

AGGGGGCGAA CGCAGCACAG AGGAAGAGGC 840 






CTGCTCCACT TGTCTGGGAA CCTGGGCAGG 
CAGGTCCCCC GGCATCTCTC TCTGTCCCGG 
CTGCAGCAGG AGCCCCAAGG AGTGCTAGCT 
CAGTGAGGTG TGCAAGGGTG CACTGAGGGT 
ATCCTTGCTG AGCATCTTTG AGCCTGCCTT 
TGAGTTAGAG GCTGCTGGGA TCCACTGTTT 
TGGCAGAGAA GTGCCATGTT TGCGTTGAGC 
TTGCTGAAAC CCAGGAGCTG AACAGTGAGG 
CCAGGAAAGC CTGTCTTTGG TTAGGCTCGT 
TGTCATTGGT CATGATATTT GAAAAGGGGA 
ATTGGAAAAT ATTTGACCCC CTTGGCTGAA 
ACTACCTTTT CAGGTTTATT GTTTTTATTT 
GCAGACAAGG TCTAGATGCG GAGTCAGAGA 
TCTCATGGTT GGCTCTGACT TTCAGCTGTG 
CTGCCTCAGT TTCCCCATCT GTAAAATGGG 
GGTGTTGTGA GGATTCATTT GTGATTTTTT 
TAAAAACAGC TAAATGTG 



AGGCACAGAG GAAGCCAAGG CCTGGAGCTG 900 

CAGCCCAGGA TGGCCTGGTG CCCCCACCTG 960 

GAGGGTGGTT GCTGGGGTGG TCCTCATGGA 1020 

GGTGGGAGGG GATCACCTGG GTTCCAGGCC 1080 

CCGGTGGGAG CAGAAAAGGC CAGACCCTGC 1140 

CCACACAGCG GGAAGGCTGC TGGGAACAGG 1200 

CTTGCAGCTC TTCCAGCTGG GGACTGGTGC 1260 

AGGCTGTCCA CCTTGCTTGG CTCACTGGGA 1320 

GTACTTCTGC AGGAAAAAAA AAAAAGGATG 1380 

GGAGGCCGAA GTTGTTCCCA TTTATCCAGT 1440 

TTCTTTTGCA GAACTACTGT GTGTCTGTTC 1500 

TTGCATGAAT TAAGACGTTT TAATTTCTTT 1560 

TGGGACTGAA TGGGGAGGGA TCCTTTGTGT 1620 

TTGGGACCAC TGGCTGATCA CATCACCTCT 1680 

AGAATAATAC TTGCCTACCT ACCTCACGGG 1740 

TTTTTTTTTT TGTACAGAGC TTTTAAGCAT 1800 



Seq ID No: 173 Protein sequence:
Protein Accession #: AK021806.1 



1 11 21 31 41 51

| | | | | |

TVLFLRADED FVSYTPRDKQ NLHENLQGLG PGVRVESLEL AIRKEIHDFA QLSENTYHVY 60
HNTEDLWGEP HAVAIHGEDD LHVTEEVYKR PLFLQPTYRY HRLPLPEQGS PLEAQLDAFV 120
SVLRETPSLL QLRDAHGPPP ALVFSCQMGV GRTNLGMVLG TLILLHRSGT TSQPEAAPTQ 180
AKPLPMEQFQ VIQSFLRMVP QGRRMVEEVD RSIM



Seq ID NO: 174 DNA sequence

Nucleic Acid Accession #: NM_016580.2 

Coding sequence: 1212-4766 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I 

GGGAAGCGGG AGGAGAGCCA CACGGTCAAG TTGCACAGGT TCTTGCAGCT TCTGGAATCA 60 

AGACCATGGG CACCCTCATA AGTCAGTGTG GGCAGGGACT GCCCCAGGGC CAATCCAAGA 120 

TCCAGAGGTA GCCATAGGGT GTGACAAGTT GTGCAGATTA CAACACTCAC CCCTTGCAAT 180 

AACGTCACTG CCTGTGACTC GGGGCCAGGC CCAGGCCAAA GOCCTTCCTA CATCATTTCG 240 

TTTAATCCTC ACAGTTTCCT GCTGAAAGGG CTACTATTCT TACTCCCATC CCCACTCTAC 300 

AGATGAGGTA ATGGAGGCCC AGGAAAGTTA AGTGACTTGT CCCAGATGAC ACCGCTGGTA 360 

AGTTGCAAAG TCAGAATTTG AACTCAGGCA GTTTACCTCT GATGGCTGCT CTGTTAATCA 420 

CAGCTGCTTT CCAGTGAGAC AAAAACGGGT GATCAGGGCA GAGTCAAGAC AGAGAGGTAA 480 

ACAAGATTGG GAAAAAGACA GGAATGAGAG GGGAACAATG GGGGAAAAGA TAGGAACAAA 540 

GAGAGTTGGG GAAGGGGAGA GAAACAGGAA ACATGACTTG CCCGGGAGGG GCATCAGTCC 600 

ACGTGCAAGC AGGTGGAGGC TCAAGTTTTC TGCTCACTTG GTGATGCAGA GGCTCCCTTT 660 

CCCTCAGCAG CCGCCTTCCT GCGTGGACAG CAGCTTCCCA TCTGGCCTGT CCCCGGAGCC 720 

CCGGCCTCAT CCTCCTCAGC GGCAGGCCAC TTAGCTTCAC AGGAAATGCT CTTTCTCTAA 780 

TTGGCATTGA AACTCACAGC CCTCCCTTTT CCTGTAGGTG GGGTTTCCAT AGGAAAAAGC 840 

TGCTTCTCTG TTTCCCCAGC CTAGCAACTG TTTGGCAGTC AGAGTCCCAC ATCCTGCTCA 900 

ACTGGGTCAG GTCCCTCTTA GACCAGCTCT TGTCCATCAT TTGCTGAAGT GGACCAACTA 960 

GTTCCCCAGT AGGGGGTCTC CCCTGGCAAT TCTTGATCGG CGTTTGGACA TCTCAGATCG 1020 

CTTCCAATGA AGATGGCCTT GCCTTGGGGT CCTGC1TOTT TCATAATCAT CTAACTATGG 1080 

GACAAGGTTG TGCCGGCAGC TCTGGGGGAA GGAGCACGGG GCTGATCAAG CCATCCAGGA 1140 

AACACTGGAG GACTTGTCCA GCCTTGAAAG AACTCTAGTG GTTTCTGAAT CTAGCCCACT 1200 

TGGCGGTAAG CATGATGCAA CTTCTGCAAC TTCTGCTGGG GCTTTTGGGG CCAGGTGGCT 1260 

ACTTATTTCT TTTAGGGGAT TGTCAGGAGG TGACCACTCT CACGGTGAAA TACCAAGTGT 1320 

CAGAGGAAGT GCCATCTGGT ACAGTGATCG GGAAGCTGTC CCAGGAACTG GGCCGGGAGG 1380 

AGAGGCGGAG GCAAGCTGGG GCTGCCTTCC AGGTGTTGCA GCTGCCTCAG GCGCTCCCCA 1440 

TTCAGGTGGA CTCTGAGGAA GGCTTGCTCA GCACAGGCAG GCGGCTGGAT CGAGAGCAGC 1500 

TGTGCCGACA GTGGGATCCC TGCCTGGTTT CCTTTGATGT GCTTGCCACA GGGGATTTGG 1560 

CTCTGATCCA TGTGGAGATC CAAGTGCTGG ACATCAATGA CCACCAGCCA CGGTTTCCCA 1620

AAGGCGAGCA GGAGCTGGAA ATCTCTGAGA GCGCCTCTCT GCGAACCCGG ATCCCCCTGG 1680 

ACAGAGCTCT TGACCCAGAC ACAGGCCCTA ACACCCTGCA CACCTACACT CTGTCTCCCA 1740 

GTGAGCACTT TGCCTTGGAT GTCATTGTGG GCCCTGATGA GACCAAACAT GCAGAACTCA 1800 

TAGTGGTGAA GGAGCTGGAC AGGGAAATCC ATTCATTTTT TGATCTGGTG TTAACTGCCT 1860 

ATGACAATGG GAACCCCCCC AAGTCAGGTA CCAGCTTGGT CAAGGTCAAC GTCTTGGACT 1920 

CCAATGACAA TAGCCCTGCG TTTGCTGAGA GTTCACTGGC ACTGGAAATC CAAGAAGATG 1980 

CTGCACCTGG TACGCTTCTC ATAAAACTGA CCGCCACAGA CCCTGACCAA GGCCCCAATG 2040 

GGGAGGTGGA GTTCTTCCTC AGTAAGCACA TGCCTCCAGA GGTGCTGGAC ACCTTCAGTA 2100 

TTGATGCCAA GACAGGCCAG GTCATTCTGC GTCGACCTCT AGACTATGAA AAGAACCCTG 2160 

CCTACGAGGT GGATGTTCAG GCAAGGGACC TGGGTCCCAA TCCTATCCCA GCCCATTGCA 2220 
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AAGTTCTCAT CAAGGTTCTG GATGTCAATG ACAACATCCC AAGCATCCAC GTCACAT6GG 22 BO 

CCTCCCAGCC ATCACTGGTG TCAGAAGCTC TTCCCAAGGA CAGTTTTATT GCTCTTGTCA 2340 

TG6CAGATGA CTTGGATTCA GGACACAATG GTTTGGTCCA CTGCTGGCTG AGCCAAGAGC 2400 

TGGGCCACTT CAGGCTGAAA AGAACTAATG GCAACACATA CATGTTGCTA ACCAATGCCA 2460 

CACTGGACAG AGAGCAGTGG CCCAAATATA CCCTCACTCT GTTAGCCCAA GACCAAGGAC 2520

TCCAGCCCTT ATCAGCCAAG AAACAGCTCA GCATTCAGAT CAGTGACATC AACGACAATG 2580 

CACCTGTGTT TGAGAAAAGC AGGTATGAAG TCTCCACGCG GGAAAACAAC TTACCCTCTC 2640 

TTCACCTCAT TACCATCAAG GCTCATGATG CAGACTTGGG CATTAATGGA AAAGTCTCAT 2700 

ACCGCATCCA GGACTCCCCA GTTGCTCACT TAGTAGCTAT TGACTCCAAC ACAGGAGAGG 2760 

TCACTGCTCA GAGGTCACTG AACTATGAAG AGATGGCCGG CTTTGAGTTC CAGGTGATCG 2820

CAGAGGACAG CGGGCAACCC ATGCTTGCAT CCAGTGTCTC TGTGTGGGTC AGCCTCTTGG 2880 

ATGCCAATGA TAATGCCCCA GAGGTGGTCC AGCCTGTGCT CAGCGATGGA AAAGCCAGCC 2940 

TCTCCGTGCT TGTGAATGCC TCCACAGGCC ACCTGCTGGT GCCCATCGAG ACTCCCAATG 3000 

GCTTGGGCCC AGCGGGCACT GACACACCTC CACTGGCCAC TCACAGCTCC CGGCCATTCC 3060 

TTTTGACAAC CATTGTGGCA AGAGATGCAG ACTCGGGGGC AAATGGAGAG CCCCTCTACA 3120

GCATCCGCAG TGGAAATGAA GCCCACCTCT TCATCCTCAA CCCTCATACG GGGCAGCTGT 3180 

TCGTCAATGT CACCAATGCC AGCAGCCTCA TTGGGAGTGA GTGGGAGCTG GAGATAGTAG 3240 

TAGAGGACCA GGGAAGCCCC CCCTTACAGA CCCGAGCCCT GTTGAGGGTC ATGTTTGTCA 3300 

CCAGTGTGGA CCACCTGAGG GACTCAGCCC GCAAGCCTGG GGCCTTGAGC ATGTCGATGC 3360 

TGACGGTGAT CTGCCTGGCT GTACTGTTGG GCATCTTCGG GTTGATCCTG GCTTTGTTCA 3420

TGTCCATCTG CCGGACAGAA AAGAAGGACA ACAGGGCCTA CAACTGTCGG GAGGCGGAGT 3480 

CCACCTACCG CCAGCAGCCC AAGAGGCCCC AGAAACACAT TCAGAAGGCA GACATCCACC 3540 

TCGTGCCTGT GCTCAGGGGT CAGGCAGGTG AGCCTTGTGA AGTCGGGCAG TCCCACAAAG 3600 

ATGTGGACAA GGAGGCGATG ATGGAAGCAG GCTGGGACCC CTGCCTGCAG GCCCCCTTCC 3660 

ACCTCACCCC GACCCTGTAC AGGACGCTGC GTAATCAAGG CAACCAGGGA GCACCGGCGG 3720

AGAGCCGAGA GGTGCTGCAA GACACGGTCA ACCTCCTTTT CAACCATCCC AGGCAGAGGA 3780

ATGCCTCCCG GGAGAACCTG AACCTTCCCG AGCCCCAGCC TGCCACAGGC CAGCCACGTT 3840 

CCAGGCCTCT GAAGGTTGCA GGCAGCCCCA CAGGGAGGCT GGCTGGAGAC CAGGGCAGTG 3900 

AGGAAGCCCC ACAGAGGCCA CCAGCCTCCT CTGCAACCCT GAGACGGCAG CGACATCTCA 3960 

ATGGCAAAGT GTCCCCTGAG AAAGAATCAG GGCCCCGTCA GATCCTGCGG AGCCTGGTCC 4020

GGCTGTCTGT GGCTGCCTTC GCCGAGCGGA ACCCCGTGGA GGAGCTCACT GTGGATTCTC 4080 

CTCCTGTTCA GCAAATCTCC CAGCTGCTGT CCTTGCTGCA TCAGGGCCAA TTCCAGCCCA 4140 

AACCAAACCA CCGAGGAAAT AAGTACTTGG CCAAGCCAGG AGGCAGCAGG AGTGCAATCC 4200 

CAGACACAGA TGGCCCAAGT GCAAGGGCTG GAGGCCAGAC AGACCCAGAA CAGGAGGAAG 4260 

GGCCTTTGGA TCCTGAAGAG GACCTCTCTG TGAAGCAACT GCTAGAAGAA GAGCTGTCAA 4320

GTCTGCTGGA CCCCAGCACA GGTCTGGCCC TGGACCGGCT GAGCGCCCCT GACCCGGCCT 4380 

GGATGGCGAG ACTCTCTTTG CCCCTCACCA CCAACTACCG TGACAATGTG ATCTCCCCGG 4440 

ATGCTGCAGC CACGGAGGAG CCAAGGACCT TCCAGACGTT CGGCAAGGCA GAGGCACCAG 4500 

AGCTGAGCCC AACAGGCACG AGGCTGGCCA GCACCTTTGT CTCGGAGATG AGCTCACTGC 4560 

TGGAGATGCT GCTGGAACAG CGCTCCAGCA TGCCCGTGGA GGCCGCCTCC GAGGCGCTGC 4620

GGCGGCTCTC GGTCTGCGGG AGGACCCTCA GTTTAGACTT GGCCACCAGT GCAGCCTCAG 4680 

GCATGAAAGT GCAAGGGGAC CCAGGTGGAA AGACGGGGAC TGAGGGCAAG AGCAGAGGCA 4740 

GCAGCAGCAG CAGCAGGTGC CTGTGAACAT ACCTCAGACG CCTCTGGATC CAAGAACCAG 4800 

GGGCCTGAGG ATCTGTGGAC AAGAGCTGGT TTCTAAAATC TTGTAACTCA CTAGCTAGCG 4860

GCGGCCTGAG AACTTTAGGG TGACTGATGC TACCCCCACA GAGGAGGCAA GAGCCCCAGG 4920

ACTAACAGCT GACTGACCAA AGCAGCCCCT TGTAAGCAGC TCTGAGTCTT TTGGAGGACA 4980 

GGGAOGGTTT GTGGCTGAGA TAAGTGTTTC CTGGCAAAAC ATATGTGGAG CACAAAGGGT 5040 

CAGTCCTCTG GCAGAACAGA TGCCACGGAG TATCACAGGC AGGAAAGGGT GGCCTTCTTG 5100

GGTAGCAGGA GTCAGGGGGC TGTACCCTGG GGGTGCCAGG AAATGCTCTC TGACCTATCA 5160 

ATAAAGGAAA AGCAGTGATT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA

Seq ID NO: 175 Protein sequence:
Protein Accession #: NP_057664.1

1 11 21 31 41 51

I I I I I I

MMQLLQLLLG LLGPGGYLFL LGDCQEVTTL TVKYQVSEEV PSGTVIGKLS QELGREERRR 60 

QAGAAFQVLQ LPQALPIQVD SEEGLLSTGR RLDREQLCRQ WDPCLVSFDV LATGDLALIH 120

VEIQVLDIND HQPRFPKGEQ ELEISESASL RTRIPLDRAL DPDTGPNTLH TYTLSPSEHF 180

ALDVIVGPDE TKHAELIVVK ELDREIHSFF DLVLTAYDNG NPPKSGTSLV KVNVLDSNDN 240

SPAFAESSLA LEIQEDAAPG TLLIKLTATD PDQGPNGEVE FFLSKHMPPE VLDTFSIDAK 300 

TGQVILRRPL DYEKNPAYEV DVQARDLGPN PIPAHCKVLI KVLDVNDNIP SIHVTWASQP 360

SLVSEALPKD SFIALVMADD LDSGHNGLVH CWLSQELGHF RLKRTNGNTY MLLTNATLDR 420 

EQWPKYTLTL LAQDQGLQPL SAKKQLSIQI SDINDNAPVF EKSRYEVSTR ENNLPSLHLI 480

TIKAHDADLG INGKVSYRIQ DSPVAHLVAI DSNTGEVTAQ RSLNYEEMAG FEFQVIAEDS 540

GQPMLASSVS VWVSLLDAND NAPEVVQPVL SDGKASLSVL VNASTGHLLV PIETPNGLGP 600

AGTDTPPLAT HSSRPFLLTT IVARDADSGA NGEPLYSIRS GNEAHLFILN PHTGQLFVNV 660

TNASSLIGSE WELEIVVEDQ GSPPLQTRAL LRVMFVTSVD HLRDSARKPG ALSMSMLTVI 720

CLAVLLGIFG LILALFMSIC RTEKKDNRAY NCREAESTYR QQPKRPQKHI QKADIHLVPV 780

LRGQAGEPCE VGQSHKDVDK EAMMEAGWDP CLQAPFHLTP TLYRTLRNQG NQGAPAESRE 840

VLQDTVNLLF NHPRQRNASR ENLNLPEPQP ATGQPRSRPL KVAGSPTGRL AGDQGSEEAP 900 

QRPPASSATL RRQRHLNGKV SPEKESGPRQ ILRSLVRLSV AAFAERNPVE ELTVDSPPVQ 960 

QISQLLSLLH QGQFQPKPNH RGNKYLAKPG GSRSAIPDTD GPSARAGGQT DPEQEEGPLD 1020

PEEDLSVKQL LEEELSSLLD PSTGLALDRL SAPDPAWMAR LSLPLTTNYR DNVISPDAAA 1080

TEEPRTFQTF GKAEAPELSP TGTRLASTFV SEMSSLLEML LEQRSSMPVE AASEALRRLS 1140
VCGRTLSLDL ATSAASGMKV QGDPGGKTGT EGKSRGSSSS SRCL

Seq ID NO: 176 DNA sequence 

Nucleic Acid Accession #: AL109712.1 

Coding sequence: 2-128 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GAGTCTCTTT GGGCCAGCCG GGCTGCTGCA GACAGACAGG AAGCACGCCT GACGCTCCTC 60 

TACCCTCGGG CAGCACAGCG GGGCTGGGAC TCACTCTAGC TTGCCCAGCA ACTTGCTTTC 120 

CTGTGTGAAC TCTGGCAGGC TGCCCTCTCT GTGCAAAGCT GCCACTGGGG CCTGCTCAGG 180 

GTGGCCTGGA ACTTGGAGGT GGGCAGTCAG GGCCTAGGAT GGGCCTGTGT CACCAGGGCA 240 

TGTGCCCTTG GGCCAGTTAC TTCCTCTCAG AGCCTTGGGC TCCTCCTCTG AGGATGGGGC 300 

TTGTTGGTGT GAAATGAGGT GAGCATGTTG AGTTGGGGAG CAGCAGGACA CGCACCTGCA 360 

GGCAGCCGCC CTGGCCACGC TCCCTCCCTA CCTTCCGAGT CCTGGGACAG ACACAGTAGA 420 

GCACAGCGGG CCAGCCTGCT CTCTTCTCTG TCTACTTTTT GCAGAAGAGT CAACAGATAC 480 

AACAGGCCCA GGGAGGTGCC CCTGGGGGCC CCAGTCCCCA TCACTCCAAG GGGCAGTCCT 540 

GCAAGTGACA AGGTGGGCCC AATCCCTGTG GAACAGGTCT CTGAGGACCA CAGAGTGGGG 600 

CCCCAGGGAA AGCTGGGAGC CGAGCTAGAG GCAGGCAGCA AGTAAGGGCA AAGCTGTGCC 660 

CCTGCCCGGA AGACCTTCCT GCCCCCAGAA CCCGACCCTC CGCAGATAGC CCTCCCTGGG 720 

CAGCAGCCCC CCAGCTTCCA AGGCCCGTGC CTCACCAGAC GCCATGCTCT CACGGACTTG 780 

TTTGCTGCTC TGTACCCTGC AGATCTGCCC CAGAGGAGCA GGTGAAAAGC CGCGCCTGCC 840 

GAGGTGCTGT GGCGGTGGAG TTTTGGGCAG AGGAGTGGGG GGAAGAGTTT CTCACTTTTA 900 

AGATTCTCCA AATCCAAGAT GAAGTCATGC TGTCCTTTGG AATGGTAGAT GCTCATTTAT 960 
GTAAAATCAT AATAAATGTT ACACAAACTG TTAAAAAAAA AAAAAAAAAA AAAAAA 

Seq ID No: 177 Protein sequence: 
Protein Accession #: AL109712.1 

1 11 21 31 41 51 

I I I I I I 

VSLGQPGCCR QTGSTPDAPL PSGSTAGLGL TLACPATCFP V

Seq ID NO: 178 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 3-107 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

AATGGAGCAC TCCAAAGAAC GATTTGACCA ATAGCATTTC TTCTCTGGGG GTTGTATTTC 60 

AAAGCATGCA ACTCTCCAGG GAACCAGAAC TAAATTGCTT AAAATGAAGT CATTCCTCAG 120 

ATTAACTTCC TCAGATAAAG TGTCAGCGGT CTGCAGAAAC GAAGAAGACA AAACTGAGAT 180 

TATCACTCAT AATTCTCTTA CTTACTATGT CAGTGAAACA ATGAGTTTGC ATTTTTGCAA 240 

TCCTAGAACA TTCTTCATTA GCCCTGGGTC ATGACCTCTT CCAGTTAATT CTCTTTCACA 300 

CCTTTAGGAA AGATTTAAGA TGAACCTTCA ATAGGATATT AACATAACTC ATAGCCAATA 360 

CCACAGCTGC CTTTCAAATT AATGAGGTTA ATTGTTCTCC AGCAAACATG AGTTTGTCTT 420 

TGGCATTTTA AATGCTTCCC ATTGATCTGA CATTTTGCTG TTTCAAGTTT TAAAGGGCTC 480

AAATCAAAGA CTATTGATAA CTGAGCAAAG AGCGAAGATC CAGAAATACG AAAACATTGT 540 

CTTTTTTTTT CCATGAAAAA CAATCATAGC CTTTTGAATT CAATCGAAGT TTCTACATTA 600 

GCCATCTAAG ACTTATTTAA TTATTTCTGT TCTCAGTCAA GC7AATTCAA GTGAATGAAC 660 

AGTATTGACT TTTAAAATCT TTTTTAAATT TTTTTAAATC TTTAGTTTAT TAAGTTTGTA 720 

GAAAAGCTCT GGGGCCATGA CCACTTACGT AAATGTTTCA GTTTAAAAAC AAAAGATTCA 780 

GGCCTCTAAT TTGAGCCAAA TCCAGGTGAT CTTGTTTGAA ATTTTTGATG AATTTGAAAA 840 

GATGAAAGTG GAACTTTTAA CATTCATGTT CCCCAAATTT TTCACTGGGA AGGGATGCTA 900 

ATTGCCTACT TAAGATATAA GTTCAAGAAT AACATTTTCA TAGAAAATTC AGAAAACTGC 960 

TTGACACAGC AGTGACATAG TTAGATGTGG CTCAGATGCC TTCCAAACCT GAGGGTCCCC 1020 

AAAGATTTCT TTACCAGTTG TTTTTAACTA TGAATCTTAA TCTTGTTCAT TCCCCTGCCA 1080 
AAACAAATTT AAAAG 

Seq ID No: 179 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

I I I I I I 

WSTPKNDLTN SISSLGVVFQ SMQLSREPEL NCLK

Seq ID NO: 180 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 2-176 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I

CCGGGTGGGG CCTCGGGATG CAGGCGCCGG TGCCCGGGCC CCTGGGCCTG CTGGACCCCG 60

CAGAAGGGCT TTCGAGGAGG AAGAAGACGT CGCTCTGGTT TGTGGGGTCT CTGCTGCTGG 120

TGTCCGTCCT CATAGTCACC GTCGGGCTGG CTGCATCAGC AGGACGGAGA ATGTGACCGT 180 

TGGGGGCTAC TACCCAGGGA TCATTCTCGG CTTTGGATCT TTCTTAGGAA TTATTGGCAT 240 

CAACTTGGTG GAGAATAGAA GGCAAATGCT GGTGGCAGCG ATOGTGTTTA TCAGTTTTGG 300

CGTGGTGGCC GCCTTCTGCT GCGCCATCGT GGACGGOGTA TTTGCAGCAC AGCACATTGA 360 

ACCGAGGCCC CTCACCACGG GAAGATGCCA GTTTTACTCC AGTGGGGTGG GGTACTTGTA 420 

CGATGTCTAC CAGACAGAGG TGAGCAGGAG CACTGAGATT CATGTGGGTT TTGCTCAGCT 480 

AACCCCGCCG ACCCCACGCG GTTTTCCCTG CACATAGGCG TGGTCTGAAT ATTTGGATTC 540 

TAATAGTTCC TGGGGGTCAC CCCTGCAGCT GGTGAACCGT TGATGCCCCC TGTGTAAGGG 600

ACCTTGACAT TTCGATGTGC TGTATTTCAC TCTGGAGTCA GAGTTCTGGA CTTGCTTCAT 660 

TAAATCACAA CAGTCTCAGA AAACAACCGC ACCACCCCGC AATCCCACCA AAGGGGCGCG 720 
CCGTCCCTAA GAGTTATCCC 

Seq ID NO: 181 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

I I I I I I

RVGPRDAGAG ARAPGPAGPR RRAFEEEEDV ALVCGVSAAG VRPHSHRRAG CISRTENV

Seq ID NO: 182 DNA sequence 

Nucleic Acid Accession #: AK001579.1

Coding sequence: 1150-2637 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

TTTTCTCTGC TTTTCGCTAC CCCGGTCACT CTCATTTCTC TCCCCTATTC CTTGTCTCTT 60 

CCCCCATCCC CCTTTCTCCT GTCCTCCCCC TGCCTCTACA GTGGTTCTCC CCGCTGAGCT 120

GCCACCAGCT GCTGGGCCCC GGGCTGCTGC GGCTGGGCCG CCTATGGCTG CGGTCCCCCT 180 

CCCATACAGC CCCGGCCCCT GGTCTCTGGC TGTCAGGGTT TGGCCTCCTT CGTGGTGACC 240 

ACCTCTTCCT GTGCTCAGCG CCGGGCCCAG GCCCCCCAGC CCCTGAGGAC ATGGTGCATC 300 

TGCGGCGGCT ACAGGAGATC AGTGTGGTTT CTGCAGCTGA CACCCCAGAT AAGAAAGAGC 360 

ATTTGGTCCT GGTGGAGACA GGAAGGACCC TGTATCTGCA AGGAGAGGGC CGGCTGGACT 420

TCACGGCATG GAACGCAGCC ATTGGGGGCG CGGCTGGTGG GGGCGGCACA GGGCTGCAGG 480 

AGCAGCAGAT GAGCCGGGGT GACATCCCCA TCATCGTGGA TGCCTGCATC AGTTTTGTTA 540 

CCCAGCATGG GCTCCGGCTG GAAGGTGTAT ACCGGAAAGG GGGCGCTCGT GCCCGCAGCC 600 

TGAGACTCCT GGCTGAGTTC CGTCGGGATG CCCGGTCGGT GAAGCTCCGA CCAGGGGAGC 660 

ACTTTGTGGA GGATGTCACT GACACACTCA AACGCTTCTT TCGTGAGCTC GATGACCCTG 720

TGACCTCTGC ACGGTTGCTG CCTCGCTGGA GGGAGGCTGC TGGTATTCCT AAGATCCCTG 780 

AGAGCCAAGG CCCAACCAGG ATCTCTGCCT TCCCCCACCA GAATCCATGG TTTGGCAGCC 840 

CTCCGCCCCA TCACTTCCCA CCCTGGGGGA TCATCCAGAG ACTTGGCTCA GGGGGAGGTG 900 

GGAAGGGGGC AGAGACACAT CCATCCTGCA TTTGTGCCTA AAAATCCCTC CCTCTGTACC 960 

AGCTGCCACT CTTTCTTCCC GGGTCCTCCC CAACCCTCCT CCATTCCATC CCCAGAGCTG 1020

CCCCAGAAGA ATCAGCGCCT GGAGAAATAT AAAGATGTGA TTGGCTGCCT GCCGCGGGTC 1080 

AACCGCCGCA CACTGGCCAC CCTCATTGGG CATCTCTATC GGGTGCAGAA ATGTGCGGCT 1140 

CTAAACCAGA TGTGCACGCG GAACTTGGCT CTGCTGTTTG CACCCAGCGT GTTCCAGACG 1200

GATGGGCGAG GGGAGCACGA GGTGCGAGTG CTGCAAGAGC TCATTGATGG CTACATCTCT 1260

GTCTTTGATA TCGATTCTGA CCAGGTAGCT CAGATTGACT TGGAGGTCAG TCTTATCACC 1320

ACCTGGAAGG ACGTGCAGCT GTCTCAGGCT GGAGACCTCA TCATGGAAGT TTATATAGAG 1380 

CAGCAGCTCC CAGACAACTG TGTCACCCTG AAGGTGTCCC CAACCCTGAC TGCTGAGGAG 1440 

CTGACTAACC AGGTACTGGA GATGCGGGGG ACAGCAGCTG GGATGGACTT GTGGGTGACT 1500 

TTTGAGATTC GCGAGCATGG GGAGCTGGAG CGGCCACTGC ATCCCAAGGA AAAGGTCTTA 1560 

GAGCAGGCTT TACAATGGTG CCAGCTCCCA GAGCCCTGCT CAGCTTCCCT GCTCTTGAAA 1620

AAAGTCCCCC TGGCCCAAGC TGGCTGCCTC TTCACAGGTA TCCGACGTGA GAGCCCACGG 1680

GTGGGGCTGT TGCGGTGTCG TGAGGAGCCA CCTCGCTTGC TGGGAAGCCG CTTCCAGGAG 1740

AGGTTCTTTC TGCTGCGTGG CCGCTGCCTG CTGCTGCTCA AGGAGAAGAA AAGCTCTAAA 1800 

CCAGAACGGG AGTGGCCTTT GGAAGGTGCC AAGGTCTACC TGGGAATCCG CAAGAAGTTA 1860 

AAGCCCCCAA CACCGTGGGG CTTCACATTG ATACTAGAGA AGATGCACCT CTACTTGTCC 1920

TGCACTGACG AGGATGAAAT GTGGGATTGG ACCACCAGCA TCCTTAAAGC CCAGCACGAT 1980 

GACCAGCAGC CAGTGGTCTT ACGACGCCAT TCCTCCTCTG ACCTTGCCCG TCAGAAGTTT 2040 

GGCACTATGC CTTTGCTGCC TATCCGTGGG GATGACAGTG GAGCCACCCT CCTCTCTGCC 2100 

AATCAGACCC TGCGGCGACT ACACAACCGG AGGACCCTGT CCATGTTCTT TCCAATGAAG 2160 

TCATCCCAGG GGTCTGTGGA GGAGCAAGAG GAGCTGGAGG AGCCTGTGTA CGAGGAGCCA 2220

GTGTATGAGG AAGTAGGGGC CTTCCCTGAG TTGATCCAGG ACACTTCTAC CTCCTTCTCC 2280 

ACCACACGGG AGTGGACAGT GAAGCCAGAG AACCCCCTCA CCAGCCAGAA GTCATTGGAT 2340 

CAACCCTTTC TCTCCAAGTC AAGCACCCTT GGCCAGGAGG AGAGGCCACC TGAGCCCCCT 2400 

CCAGGCCCCC CTTCAAAGAG CAGTCCCCAG GCACGGGGGT CCCTAGAGGA ACAGCTGCTC 2460 

CAGGAGCTCA GCAGCCTCAT CCTGAGGAAA GGAGAGACCA CTGCAGGCCT GGGAAGTCCT 2520

TCCCAGCCAT CCAGCCCCCA ATCCCCCAGC CCCACTGGCC TTCCAACACA GACACCTGGC 2580 

TTCCCCACCC AACCCCCATG CACTTCCAGT CCACCCTCCA GCCAGCCCCT CACATGACCC 2640 

TAGGACCAGC AGTCTGAGAG GGTAGGTACC AGAAGACCCA GAAACTCTTA TCGTGGCACT 2700 

GTTGCAGCTT CCTCTGCCCT GGCTGGAAAG ACTCCAGAAT CCAGTGTGGT GCTGTGGAAG 2760 

GAGCACTGGA CTAAAGGCTT CAGTGGCTGC GTGTCCCAGG ACAGGTCATG GCCCCTCTCT 2820

GGGCCCAGCC CATTTATCTA TACCATGAGG TAACTGAAGT AAGGAGAGCA GTGAATGTCA 2880 






AACTGTGTTT CTTAGAGCCA TAAGCCCCAC ATATTATCCC TGAACAAGGG CAGCTCCTGC 2940 

TTTATATATT TGATACGTAG GGGTTCCATG AGAGATTTTG GGTTTTAAAG GAATGGTTTT 3000

ACTGCATTAA AGAAAAAAAA TGCTTTGGAA ACCAGAGGCC TGGGTGATGT TAAAGTCTAT 3060 

CCTGTCCCAC TTCCTACATT CTGGGACTAC CGTGAAGCCT GGAGTAGGGA GAGCGAGTTT 3120 
GGGAGCTGGG ACTCGGGGAG TCAAAAATAG ATGAGTAATT GTCAATAAAC CTGGGAACC 



Seq ID No: 183 Protein sequence: 
Protein Accession #: AK001579.1 



1 11 21 31 41 51

I I I I I I

MSLTHSNASP VSSMTLPLHG CCLAGGRLLV FLRSLRAKAQ PGSLPSPTRI HGLAALRPIT 60
SHPGGSSRDL AQGEVGRGQR HIHPAFVPKN PSLCTSCHSF FPGPPQPSSI PSPELPQKNQ 120
RLEKYKDVIG CLPRVNRRTL ATLIGHLYRV QKCAALNQMC TRNLALLFAP SVFQTDGRGE 180
HEVRVLQELI DGYISVFDID SDQVAQIDLE VSLITTWKDV QLSQAGDLIM EVYIEQQLPD 240
NCVTLKVSPT LTAEELTNQV LEMRGTAAGM DLWVTFEIRE HGELERPLHP KEKVLEQALQ 300
WCQLPEPCSA SLLLKKVPLA QAGCLFTGIR RESPRVGLLR CREEPPRLLG SRFQERFFLL 360
RGRCLLLLKE KKSSKPSREW PLEGAKVYLG IRKKLKPPTP WGFTLILEKM HLYLSCTDED 420
EMWDWTTSIL KAQHDDQQPV VLRRHSSSDL ARQKFGTMPL LPIRGDDSGA TLLSANQTLR 480
RLHNRRTLSM FFPMKSSQGS VEEQEELEEP VYEEPVYEEV GAFPELIQDT STSFSTTREW 540
TVKPENPLTS QKSLDQPFLS KSSTLGQEER PPEPPPGPPS KSSPQARGSL EEQLLQELSS 600
LILRKGETTA GLGSPSQPSS PQSPSPTGLP TQTPGFPTQP PCTSSPPSSQ PLT



Seq ID NO: 184 DNA sequence 

Nucleic Acid Accession #: none found 

Coding sequence: 1-81 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GTAGAGTTAG TGTCAATGTG CTTAGAATAT ACCAAATTCA TAAACATTTT CTCTAAAAAA 60
GTATTAAGCT TAAAAAGTTA ATTCAGTTTA AGGAATATAA ACCAAATTAT TTTATATTTG 120
AATCTCAACA TAAGAAGTCA AAATGTAATG CTGCCAGATA ACAATATCAA AGGTATTTTT 180
CTTTCTCTAT AATTTCATCA GTATGTCCTC TCCCTTTTCT CCTATTTGTC AAATTTTAGC 240
AACCCTAACT CTGCTAATTA TAAGCTAGGC AAGTAATCTT GGACAAGTTA TTTGACCTCT 300
CACTGCACCA GCTTTGTTAT CTGTAAAATG ATGATAATAC CAACACCTTC TTCTTGGGGT 360
ACTGAAGATG AGAGAACATG ATATGTGTAA AGTGCCTTCC ACAATACCCA GAACATAGCA 420
AACATGTAAT GAATGTAGTA ATAGTAATTA TTTTATTTTC TTTTGATTCA GTTGGGACTA 480
TGTTCAGCTG TAACAGAATA CCCAAAATAA CAGTTTTAAA CAAATTAAAG TTTTGTTGTG 540
AAGTTTTGTT ACGAATTCAG ACAATCCAGG GCTTTTATAG ATGCACCAGG ATCAGCAGGT 600
ACAAAGGCAT CTTTCCTGAT TTCTGCCAGT CTCAATGCAT GGGTTGCAAT CCAGAGTCCA 660
GGATGGCAGT TCCAGCCCTG GTTACGCCCA TATTAGCACA CAGAAAGAAA GAGAAAGGGA 720
TGTGCCTCTT CACTTTAATC ATAGCTCCCA CTAGATGCAC CCACTACTTC TGCTGATACT 780
CCATTAGCTA ATGCTTGCTT ACATGGTCAC ACTTAGTTTC CAGAGAGACA TGTCTGGACA 840
GTCATGTGCT CAATTAATAT CCAAGTGTCC AATTACTGAG AAAAAAAGAA ACTAGCACCT 900
TTGCTTGGTT GCATTCTTCT TAGCATAAGC CACATTCTTT TTATGAAGTT GTCCTCAGTT 960
ACTTGGATGC CTCAGTTGTC CTTTCATTTA GAAATGCTCC TTGGACATCC TGAATCTGAC 1020
TTCTTTTGTC ATCAGCACCA TCACTACCAC TGCCTTCTTC AAAGCCACCA CGTTCTGTCC 1080
CAGGATGGTT GCAACAACCA CCATAGGGAC TTTTTGCTTC TACTTCCACA CAATAGCCAG 1140
AGTAAGCTTT TGAAAATGTA GGTCAGATCA TGTCTCTCTC TTCTCTTCAA AACCCTCCGA 1200
TGGCTTTTCA TATTACTCAA AAGAAAACCT AAAACTTTGC TGTGAGATCT ATGTGACCCG 1260
GCTTATTCTT CCTCTTACTT TATCTCTGTA TTGCTCTTCC TCACTCTACT CCAGCCATCC 1320
CACCTCCTTG CTGCTTGTCC TATACTCCTA AAAGAAGTTC AGTCTTCCCT TATGATATTT 1380
GCACTTAAAA TAGAAAAAAA AAAAAAAAAA AGCTCAGAGA GGCTGAGTTG TCCAAGGTCA 1440
TGCAGGTTAG AAGTCATGGA GCTGGGATCT AAATCCATGT CAGTCTGACT ATGAGTTCTG 1500
CACCGTTCTA TTCAACCCCA TTGCCTAGAG GTGCTTGATT GCTCAATAAT AGATTCCATG 1560
GACACAGTCA GCTCTTTCTG AGAAAAGGCA GCTCAGCATT TCCATGAGAT CCGCACATCC 1620
TTTTGCAGAA GAAAAC



Seq ID No: 185 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51

I I I I I I

VELVSMCLEY TKFINIFSKK VLSLKS

Seq ID NO: 186 DNA sequence 
Nucleic Acid Accession #: NM_002203.2

Coding sequence: 43-3588 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

CTGCAAACCC AGCGCAACTA CGGTCCCCCG GTCAGACCCA GGATGGGGCC AGAACGGACA 60
GGGGCCGCGC CGCTGCCGCT GCTGCTGGTG TTAGCGCTCA GTCAAGGCAT TTTAAATTGT 120
TGTTTGGCCT ACAATGTTGG TCTCCCAGAA GCAAAAATAT TTTCCGGTCC TTCAAGTGAA 180
CAGTTTGGGT ATGCAGTGCA GCAGTTTATA AATCCAAAAG GCAACTGGTT ACTGGTTGGT 240
TCACCCTGGA GTGGCTTTCC TGAGAACCGA ATGGGAGATG TGTATAAATG TCCTGTTGAC 300
CTATCCACTG CCACATGTGA AAAACTAAAT TTGCAAACTT CAACAAGCAT TCCAAATGTT 360
ACTGAGATGA AAACCAACAT GAGCCTCGGC TTGATCCTCA CCAGGAACAT GGGAACTGGA 420
GGTTTTCTCA CATGTGGTCC TCTGTGGGCA CAGCAATGTG GGAATCAGTA TTACACAACG 480
GGTGTGTGTT CTGACATCAG TCCTGATTTT CAGCTCTCAG CCAGCTTCTC ACCTGCAACT 540
CAGCCCTGCC CTTCCCTCAT AGATGTTGTG GTTGTGTGTG ATGAATCAAA TAGTATTTAT 600
CCTTGGGATG CAGTAAAGAA TTTTTTGGAA AAATTTGTAC AAGGCCTTGA TATAGGCCCC 660
ACAAAGACAC AGGTGGGGTT AATTCAGTAT GCCAATAATC CAAGAGTTGT GTTTAACTTG 720
AACACATATA AAACCAAAGA AGAAATGATT GTAGCAACAT CCCAGACATC CCAATATGGT 780
GGGGACCTCA CAAACACATT CGGAGCAATT CAATATGCAA GAAAATATGC CTATTCAGCA 840
GCTTCTGGTG GGCGACGAAG TGCTACGAAA GTAATGGTAG TTGTAACTGA CGGTGAATCA 900
CATGATGGTT CAATGTTGAA AGCTGTGATT GATCAATGCA ACCATGACAA TATACTGAGG 960
TTTGGCATAG CAGTTCTTGG GTACTTAAAC AGAAACGCCC TTGATACTAA AAATTTAATA 1020
AAAGAAATAA AAGCGATCGC TAGTATTCCA ACAGAAAGAT ACTTTTTCAA TGTGTCTGAT 1080
GAAGCAGCTC TACTAGAAAA GGCTGGGACA TTAGGAGAAC AAATTTTCAG CATTGAAGGT 1140
ACTGTTCAAG GAGGAGACAA CTTTCAGATG GAAATGTCAC AAGTGGGATT CAGTGCAGAT 1200
TACTCTTCTC AAAATGATAT TCTGATGCTG GGTGCAGTGG GAGCTTTTGG CTGGAGTGGG 1260
ACCATTGTCC AGAAGACATC TCATGGCCAT TTGATCTTTC CTAAACAAGC CTTTGACCAA 1320
ATTCTGCAGG ACAGAAATCA CAGTTCATAT TTAGGTTACT CTGTGGCTGC AATTTCTACT 1380
GGAGAAAGCA CTCACTTTGT TGCTGGTGCT CCTCGGGCAA ATTATACCGG CCAGATAGTG 1440
CTATATAGTG TGAATGAGAA TGGCAATATC ACGGTTATTC AGGCTCACCG AGGTGACCAG 1500
ATTGGCTCCT ATTTTGGTAG TGTGCTGTGT TCAGTTGATG TGGATAAAGA CACCATTACA 1560
GACGTGCTCT TGGTAGGTGC ACCAATGTAC ATGAGTGACC TAAAGAAAGA GGAAGGAAGA 1620
GTCTACCTGT TTACTATCAA AAAGGGCATT TTGGGTCAGC ACCAATTTCT TGAAGGCCCC 1680
GAGGGCATTG AAAACACTCG ATTTGGTTCA GCAATTGCAG CTCTTTCAGA CATCAACATG 1740
GATGGCTTTA ATGATGTGAT TGTTGGTTCA CCACTAGAAA ATCAGAATTC TGGAGCTGTA 1800
TACATTTACA ATGGTCATCA GGGCACTATC CGCACAAAGT ATTCCCAGAA AATCTTGGGA 1860
TCCGATGGAG CCTTTAGGAG CCATCTCCAG TACTTTGGGA GGTCCTTGGA TGGCTATGGA 1920
GATTTAAATG GGGATTCCAT CACCGATGTG TCTATTGGTG CCTTTGGACA AGTGGTTCAA 1980
CTCTGGTCAC AAAGTATTGC TGATGTAGCT ATAGAAGCTT CATTCACACC AGAAAAAATC 2040
ACTTTGGTCA ACAAGAATGC TCAGATAATT CTCAAACTCT GCTTCAGTGC AAAGTTCAGA 2100
CCTACTAAGC AAAACAATCA AGTGGCCATT GTATATAACA TCACACTTGA TGCAGATGGA 2160
TTTTCATCGA GAGTAACCTC CAGGGGGTTA TTTAAAGAAA ACAATGAAAG GTGCCTGCAG 2220
AAGAATATGG TAGTAAATCA AGCACAGAGT TGCCCCGAGC ACATCATTTA TATACAGGAG 2280
CCCTCTGATG TTGTCAACTC TTTGGATTTG CGTGTGGACA TCAGTCTGGA AAACCCTGGC 2340
ACTAGCCCTG CCCTTGAAGC CTATTCTGAG ACTGCCAAGG TCTTCAGTAT TCCTTTCCAC 2400
AAAGACTGTG GTGAGGATGG ACTTTGCATT TCTGATCTAG TCCTAGATGT CCGACAAATA 2460
CCAGCTGCTC AAGAACAACC CTTTATTGTC AGCAACCAAA ACAAAAGGTT AACATTTTCA 2520
GTAACACTGA AAAATAAAAG GGAAAGTGCA TACAACACTG GAATTGTTGT TGATTTTTCA 2580
GAAAACTTGT TTTTTGCATC ATTCTCCCTA CCGGTTGATG GGACAGAAGT AACATGCCAG 2640
GTGGCTGCAT CTCAGAAGTC TGTTGCCTGC GATGTAGGCT ACCCTGCTTT AAAGAGAGAA 2700
CAACAGGTGA CTTTTACTAT TAACTTTGAC TTCAATCTTC AAAACCTTCA GAATCAGGCG 2760
TCTCTCAGTT TCCAAGCCTT AAGTGAAAGC CAAGAAGAAA ACAAGGCTGA TAATTTGGTC 2820
AACCTCAAAA TTCCTCTCCT GTATGATGCT GAAATTCACT TAACAAGATC TACCAACATA 2880
AATTTTTATG AAATCTCTTC GGATGGGAAT GTTCCTTCAA TCGTGCACAG TTTTGAAGAT 2940
GTTGGTCCAA AATTCATCTT CTCCCTGAAG GTAACAACAG GAAGTGTTCC AGTAAGCATG 3000
GCAACTGTAA TCATCCACAT CCCTCAGTAT ACCAAAGAAA AGAACCCACT GATGTACCTA 3060
ACTGGGGTGC AAACAGACAA GGCTGGTGAC ATCAGTTGTA ATGCAGATAT CAATCCACTG 3120
AAAATAGGAC AAACATCTTC TTCTGTATCT TTCAAAAGTG AAAATTTCAG GCACACCAAA 3180
GAATTGAACT GCAGAACTGC TTCCTGTAGT AATGTTACCT GCTGGTTGAA AGACGTTCAC 3240
ATGAAAGGAG AATACTTTGT TAATGTGACT ACCAGAATTT GGAACGGGAC TTTCGCATCA 3300
TCAACGTTCC AGACAGTACA GCTAACGGCA GCTGCAGAAA TCAACACCTA TAACCCTGAG 3360
ATATATGTGA TTGAAGATAA CACTGTTACG ATTCCCCTGA TGATAATGAA ACCTGATGAG 3420
AAAGCCGAAG TACCAACAGG AGTTATAATA GGAAGTATAA TTGCTGGAAT CCTTTTGCTG 3480
TTAGCTCTGG TTGCAATTTT ATGGAAGCTC GGCTTCTTCA AAAGAAAATA TGAAAAGATG 3540
ACCAAAAATC CAGATGAGAT TGATGAGACC ACAGAGCTCA GTAGCTGAAC CAGCAGACCT 3600
ACCTGCAGTG GGAACCGGCA GCATCCCAGC CAGGGTTTGC TGTTTGCGTG CATGGATTTC 3660
TTTTTAAATC CCATATTTTT TTTATCATGT CGTAGGTAAA CTAACCTGGT ATTTTAAGAG 3720
AAAACTGCAG GTCAGTTTGG ATGAAGAAAT TGTGGGGGGT GGGGGAGGTG OGGGGGGCAG 3780
GTAGGGAAAT AATAGGGAAA ATACCTATTT TATATGATGG GGGAAAAAAA GTAATCTTTA 3840
AACTGGCTGG CCCAGAGTTT ACATTCTAAT TTGCATTGTG TCAGAAACAT GAAATGCTTC 3900
CAAGCATGAC AACTTTTAAA GAAAAATATG ATACTCTCAG ATTTTAAGGG GGAAAACTGT 3960
TCTCTTTAAA ATATTTGTCT TTAAACAGCA ACTACAGAAG TGGAAGTGCT TGATATGTAA 4020
GTACTTCCAC TTGTGTATAT TTTAATGAAT ATTGATGTTA ACAAGAGGGG AAAACAAAAC 4080
ACAGGTTTTT TCAATTTATG CTGCTCATCC AAAGTTGCCA CAGATGATAC TTCCAAGTGA 4140
TAATTTTATT TATAAACTAG GTAAAATTTG TTGTTGGTTC CTTTTATACC ACGGCTGCCC 4200
CTTCCACACC CCATCTTGCT CTAATGATCA AAACATGCTT GAATAACTGA GCTTAGAGTA 4260
TACCTCCTAT ATGTCCATTT AAGTTAGGAG AGGGGGCGAT ATAGAGACTA AGGCACAAAA 4320
TTTTGTTTAA AACTCAGAAT ATAACATTTA TGTAAAATCC CATCTGCTAG AAGCCCATCC 4380
TGTGCCAGAG GAAGGAAAAG GAGGAAATTT CCTTTCTCTT TTAGGAGGCA CAACAGTTCT 4440
CTTCTAGGAT TTGTTTGGCT GACTGGCAGT AACCTAGTGA ATTTTTGAAA GATGAGTAAT 4500
TTCTTTGGCA ACCTTCCTCC TCCCTTACTG AACCACTCTC CCACCTCCTG GTGGTACCAT 4560
TATTATAGAA GCCCTCTACA GCCTGACTTT CTCTCCAGCG GTCCAAAGTT ATCCCCTCCT 4620
TTACCCCTCA TCCAAAGTTC CCACTCCTTC AGGACAGCTG CTGTGCATTA GATATTAGGG 4680



GGGAAAGTCA TCTGTTTAAT TTACACACTT GCATGAATTA CTGTATATAA ACTCCTTAAC 4740
TTCAGGGAGC TATTTTCATT TAGTGCTAAA CAAGTAAGAA AAATAAGCTA GAGTGAATTT 4800
CTAAATGTTG GAATGTTATG GGATGTAAAC AATGTAAAGT AAAACACTCT CAGGATTTCA 4860
CCAGAAGTTA CAGATGAGGC ACTGGAAACC ACCACCAAAT TAGCAGGTGC ACCTTCTGTG 4920
GCTGTCTTGT TTCTGAAGTA CTTTTTCTTC CACAAGAGTG AATTTGACCT AGGCAAGTTT 4980
GTTCAAAAGG TAGATCCTGA GATGATTTGG TCAGATTGGG ATAAGGCCCA GCAATCTGCA 5040
TTTTAACAAG CACCCCAGTC ACTAGGATGC AGATGGACCA CACTTTGAGA AACACCACCC 5100
ATTTCTACTT TTTGCACCTT ATTTTCTCTG TTCCTGAGCC CCCACATTCT CTAGGAGAAA 5160
CTTAGATTAA AATTCACAGA CACTACATAT CTAAAGCTTT GACAAGTCCT TGACCTCTAT 5220
AAACTTCAGA GTCCTCATTA TAAAATGGGA AGACTGAGCT GGAGTTCAGC AGTGATGCTT 5280
TTTAGTTTTA AAAGTCTATG ATCTGATCTG GACTTCCTAT AATACAAATA CACAATCCTC 5340
CAAGAATTTG ACTTGGAAAA G

Seq ID NO: 187 Protein sequence:
Protein Accession #: NP_002194.1

1 11 21 31 41 51

I I I I I I

MGPERTGAAP LPLLLVLALS QGILNCCLAY NVGLPEAKIF SGPSSEQFGY AVQQFINPKG 60
NWLLVGSPWS GFPENRMGDV YKCPVDLSTA TCEKLNLQTS TSIPNVTEMK TNMSLGLILT 120
RNMGTGGFLT CGPLWAQQCG NQYYTTGVCS DISPDFQLSA SFSPATQPCP SLIDVVVVCD 180
ESNSIYPWDA VKNFLEKFVQ GLDIGPTKTQ VGLIQYANNP RVVFNLNTYK TKEEMIVATS 240
QTSQYGGDLT NTFGAIQYAR KYAYSAASGG RRSATKVMVV VTDGESHDGS MLKAVIDQCN 300
HDNILRFGIA VLGYLNRNAL DTKNLIKEIK AIASIPTERY FFNVSDEAAL LEKAGTLGEQ 360
IFSIEGTVQG GDNFQMEMSQ VGFSADYSSQ NDILMLGAVG AFGWSGTIVQ KTSHGHLIFP 420
KQAFDQILQD RNHSSYLGYS VAAISTGEST HFVAGAPRAN YTGQIVLYSV NENGNITVIQ 480
AHRGDQIGSY FGSVLCSVDV DKDTITDVLL VGAPMYMSDL KKEEGRVYLF TIKKGILGQH 540
QFLEGPEGIE NTRFGSAIAA LSDINMDGFN DVIVGSPLEN QNSGAVYIYN GHQGTIRTKY 600
SQKILGSDGA FRSHLQYFGR SLDGYGDLNG DSITDVSIGA FGQVVQLWSQ SIADVAIEAS 660
FTPEKITLVN KNAQIILKLC FSAKFRPTKQ NNQVAIVYNI TLDADGFSSR VTSRGLFKEN 720
NERCLQKNMV VNQAQSCPEH IIYIQEPSDV VNSLDLRVDI SLENPGTSPA LEAYSETAKV 780
FSIPFHKDCG EDGLCISDLV LDVRQIPAAQ EQPFIVSNQN KRLTFSVTLK NKRESAYNTG 840
IVVDFSENLF FASFSLPVDG TEVTCQVAAS QKSVACDVGY PALKREQQVT FTINFDFNLQ 900
NLQNQASLSF QALSESQEEN KADNLVNLKI PLLYDAEIHL TRSTNINFYE ISSDGNVPSI 960
VHSFEDVGPK FIFSLKVTTG SVPVSMATVI IHIPQYTKEK NPLMYLTGVQ TDKAGDISCN 1020
ADINPLKIGQ TSSSVSFKSE NFRHTKELNC RTASCSNVTC WLKDVHMKGE YFVNVTTRIW 1080
NGTFASSTFQ TVQLTAAAEI NTYNPEIYVI EDNTVTIPLM IMKPDEKAEV PTGVIIGSII 1140
AGILLLLALV AILWKLGFFK RKYEKMTKNP DEIDETTELS S


Seq ID NO: 188 DNA sequence 

Nucleic Acid Accession #: NM_002210.1 

Coding sequence: 42-3188 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

I I I I I I

GGCTACCGCT CCCGGCTTGG CGTCCCXSCGC GCACTTCGGC GATGGCTTTT CCGCCGCGGC 60
GACGGCTGCG [illegible] [illegible] CGCTTCTTCT CTCGGGACTC CTGCTACCTC 120
TGTGCCGCGC CTTCAACCTA GACGTGGACA GTCCTGCCGA GTACTCTGGC CCCGAGGGAA 180
GTTACTTCGG CTTCGCCGTG GATTTCTTCG TGCCCAGCGC GTCTTCCCGG ATGTTTCTTC 240
TCGTGGGAGC TCCCAAAGCA AACACCACCC AGCCTGGGAT TGTGGAAGGA GGGCAGGTCC 300
TCAAATGTGA CTGGTCTTCT ACCCGCCGGT GCCAGCCAAT TGAATTTGAT GCAACAGGCA 360
ATAGAGATTA TGCCAAGGAT GATCCATTGG AATTTAAGTC CCATCAGTGG TTTGGAGCAT 420
CTGTGAGGTC GAAACAGGAT AAAATTTTGG CCTGTGCCCC ATTGTACCAT TGGAGAACTG 480
AGATGAAACA GGAGCGAGAG CCTGTTGGAA CATGCTTTCT TCAAGATGGA ACAAAGACTG 540
TTGAGTATGC TCCATGTAGA TCACAAGATA TTGATGCTGA TGGACAGGGA TTTTGTCAAG 600
GAGGATTCAG CATTGATTTT ACTAAAGCTG ACAGAGTACT TCTTGGTGGT CCTGGTAGCT 660
TTTATTGGCA AGGTCAGCTT ATTTCGGATC AAGTGGCAGA AATCGTATCT AAATACGACC 720
CCAATGTTTA CAGCATCAAG TATAATAACC AATTAGCAAC TCGGACTGCA CAAGCTATTT 780
TTGATGACAG CTATTTGGGT TATTCTGTGG CTGTCGGAGA TTTCAATGGT GATGGCATAG 840
ATGACTTTGT TTCAGGAGTT CCAAGAGCAG CAAGGACTTT GGGAATGGTT TATATTTATG 900
ATGGGAAGAA CATGTCCTCC TTATACAATT TTACTGGCGA GCAGATGGCT GCATATTTCG 960
GATTTTCTGT AGCTGCCACT GACATTAATG GAGATGATTA TGCAGATGTG TTTATTGGAG 1020
CACCTCTCTT CATGGATCGT GGCTCTGATG GCAAACTCCA AGAGGTGGGG CAGGTCTCAG 1080
TGTCTCTACA GAGAGCTTCA GGAGACTTCC AGACGACAAA GCTGAATGGA TTTGAGGTCT 1140
TTGCACGGTT TGGCAGTGCC ATAGCTCCTT TGGGAGATCT GGACCAGGAT GGTTTCAATG 1200
ATATTGCAAT TGCTGCTCCA TATGGGGGTG AAGATAAAAA AGGAATTGTT TATATCTTCA 1260
ATGGAAGATC AACAGGCTTG AACGCAGTCC CATCTCAAAT CCTTGAAGGG CAGTGGGCTG 1320
CTCGAAGCAT GCCACCAAGC TTTGGCTATT CAATGAAAGG AGCCACAGAT ATAGACAAAA 1380
ATGGATATCC AGACTTAATT GTAGGAGCTT TTGGTGTAGA TCGAGCTATC TTATACAGGG 1440
CCAGACCAGT TATCACTGTA AATGCTGGTC TTGAAGTGTA CCCTAGCATT TTAAATCAAG 1500
ACAATAAAAC CTGCTCACTG CCTGGAACAG CTCTCAAAGT TTCCTGTTTT AATGTTAGGT 1560
TCTGCTTAAA GGCAGATGGC AAAGGAGTAC TTCCCAGGAA ACTTAATTTC CAGGTGGAAC 1620
TTCTTTTGGA TAAACTCAAG CAAAAGGGAG CAATTCGACG AGCACTGTTT CTCTACAGCA 1680
GGTCCCCAAG TCACTCCAAG AACATGACTA TTTCAAGGGG GGGACTGATG CAGTGTGAGG 1740
AATTGATAGC GTATCTGCGG GATGAATCTG AATTTAGAGA CAAACTCACT CCAATTACTA 1800
TTTTTATGGA ATATCGGTTG GATTATAGAA CAGCTGCTGA TACAACAGGC TTGCAACCCA 1860
TTCTTAACCA GTTCACGCCT GCTAACATTA GTCGACAGGC TCACATTCTA CTTGACTGTG 1920

GTGAAGACAA TGTCTGTAAA CCCAAGCTGG AAGTTTCTGT AGATAGTGAT CAAAAGAAGA 1980 

TCTATATTGG GGATGACAAC CCTCTGACAT TGATTGTTAA GGCTCAGAAT CAAGGAGAAG 2040 

GTGCCTACGA AGCTGAGCTC ATCGTTTCCA TTCCACTGCA GGCTGATTTC ATCGGGGTTG 2100 

TCCGAAACAA TGAAGCCTTA GCAAGACTTT CCTGTGCATT TAAGACAGAA AACCAAACTC 2160

GCCAGGTGGT ATGTGACCTT GGAAACCCAA TGAAGGCTGG AACTCAACTC TTAGCTGGTC 2220 

TTCGTTTCAG TGTGCACCAG CAGTCAGAGA TGGATACTTC TGTGAAATTT GACTTACAAA 2280 

TCCAAAGCTC AAATCTATTT GACAAAGTAA GCCCAGTTGT ATCTCACAAA GTTGATCTTG 2340 

CTGTTTTAGC TGCAGTTGAG ATAAGAGGAG TCTCGAGTCC TGATCATATC TTTCTTCCGA 2400 

TTCCAAACTG GGAGCACAAG GAGAACCCTG AGACTGAAGA AGATGTTGGG CCAGTTGTTC 2460 

AGCACATCTA TGAGCTGAGA AACAATGGTC CAAGTTCATT CAGCAAGGCA ATGCTCCATC 2520 

TTCAGTGGCC TTACAAATAT AATAATAACA CTCTGTTGTA TATCCTTCAT TATGATATTG 2580 

ATGGACCAAT GAACTGCACT TCAGATATGG AGATCAACCC TTTGAGAATT AAGATCTCAT 2640 

CTTTGCAAAC AACTGAAAAG AATGACACGG TTGCCGGGCA AGGTGAGCGG GACCATCTCA 2700 

TCACTAAGCG GGATCTTGCC CTCAGTGAAG GAGATATTCA CACTTTGGGT TGTGGAGTTG 2760 

CTCAGTGCTT GAAGATTGTC TGCCAAGTTG GGAGATTAGA CAGAGGAAAG AGTGCAATCT 2820 

TGTACGTAAA GTCATTACTG TGGACTGAGA CTTTTATGAA TAAAGAAAAT CAGAATCATT 2880 

CCTATTCTCT GAAGTCGTCT GCTTCATTTA ATGTCATAGA GTTTCCTTAT AAGAATCTTC 2940 

CAATTGAGGA TATCACCAAC TCCACATTGG TTACCACTAA TGTCACCTGG GGCATTCAGC 3000 

CAGCGCCCAT GCCTGTGCCT GTGTGGGTGA TCATTTTAGC AGTTCTAGCA GGATTGTTGC 3060 

TACTGGCTGT TTTGGTATTT GTAATGTACA GGATGGGCTT TTTTAAACGG GTCCGGCCAC 3120 

CTCAAGAAGA ACAAGAAAGG GAGCAGCTTC AACCTCATGA AAATGGTGAA GGAAACTCAG 3180 

AAACTTAACT GCAGTTTTTA AGTTATGCTA CATCTTGACC CACTAGAATT AGCAACTTTA 3240 

TTATAGATTT AAACTTTCTT CATGAGGAGT AAAAATCCAA GGCTTTACTG CTGATAGTGC 3300 

TAATTGGCAT TAACCACAAA ATGAGAATTA TATTTGTCAA CCTTCTCCTT ATAAATAAGT 3360 

TCAGACATAC ATTTAATAAC ATAGGGTGAC TTGTGTTTTT AGGTATTTAA ATAATAAAAT 3420 

TTCAAGGGAT AGTTTTTATT CAATGTATAT AAGACAGGTA GTGCCTGATT TACTACTTTA 3480 

TATAAAATAG TACCTCCTTC AGTTACTGTT TCTGATTTAA TGTACGGAAC TTTATTTGTT 3540 

GTTGTTGTTG TTGTTGTTT3T TGTTGTTTTA AAGCAGTCCA AATTTGGACC TTAGCAATCA 3600 

TGTCTTTTGT ATAGGTACTT AATGTTAATA CATATTACAC TACAGTTTAC TTTTCAGAAT 3660 

ACTAAAGACT TTATAACTGC ATGAACTTGG ATTTTTTTAA TCACTCATAT GGTAGAATTT 3720 

TATAAACACA TACATGATAC CATCCAAATT CTTGCTTTTA ATAACAAAGG TACAATATTT 3780 

TGTTTTAGTA TGAAAATCTG GTAGATCCTA TTACACTTCT GTTTATATTA AATCCACAAT 3840 

ATTTTATTAC ATTTTTAACT TGTATAAATT TTAGGTCAAA TCCTTCAAGC CAACCTATAC 3900 

TAAAAATTAG TTCCATAATC ACAAATGGCT CTTTTGTGTA ATTGTTTAAT TTCACCTGAA 3960 

TATCATAATG CTTAAAGCCA TATGGAGTTG GAAATTATTT CCAAAGCATA TTTATTCCAT 4020 

TGTTTTAGTC TGGCTATTTA CAGTATAAAA AAAGCATTTT ATTAAAATAC TGTGTAGTTC 4080 

TTTGAGATAG TTGCTTATGC ATATAGTAAG TATTACATTC TTAGAGTAGA GCAGAGTTTT 4140 

TAGTTAGTAT TAATTTATTT TCCTCCATTC ATGTACTTTT CCTTATATTT CCAAAACTGT 4200 

TACTGAGAAT GGGTCAAGAT CAGTGAGAAA TCTTTACAGT TGACAGGAAC CTGGACCCCT 4260 

TACCCCAACT TTATGAGTAA TGCTTGGAAT AAAAAACTCT TAAGGCAACT CACTGATTTA 4320 

CTTCTAGCAA TAGCATGATG TTACAGGAAT ATTACCTCTG TTTAAGCAAG GTAATGTGTA 4380 

AAATCAGTCT CGGCTGTCAG AATAACTTCT AAAAGGTATT TTTATAAGCA GTTCAAGTTA 4440 

CTGAAAACCT TTTAAACCTT TCTGAAGTTC GTTAGTATAA ATTACTTTTC TAGGATTATT 4500 

AATAAAAGCC ACATAGGTGG CAAGTTGTAG TTTTATATGG CTCTGTAGAG TGGTGAACCT 4560 

TCTAGAGGAA TATATGATTT ATTCACAGTT CCTCAAGGCC TGGGGATGAT GATCAGTTAT 4620 

ACCTATTTTT GTGCAATTAC ATCATGTTGT ACATTAGAAA TGGAGAGTTT AATAGCTCTT 4680 

TAACTGCTGT CCTCATTAGG TAATGATAAA TATTTCCCTT AAATAATTGA CTATTTTGCT 4740 

GTGTTTTAAA AATGATTGAA ATTTATCTTG CCATATCTCA TAATTTCATG CACAAGTTGA 4800 

CTGAGCTAAT CTTGAGAATA TATTCGTAAA ATAGGAGCAC ATTTAGTTGA GGTATACAAG 4860 

GTAGGACTCT AGACAAAACC TTCTATTTTA GCTTTAGTGA ATTTCAAAAG TAATGGGTCT 4920 

TGGAGTATAG ATTTTTATTA GTAGCTTGAA AGAGCTTAAT CATATGCAGT AAGTATTTTT 4980 

ATTACCAATA AATTTAAAAT TTTTTAAGAA AAATATTTTT ATCCTAGGGC CAAGTGTTGC 5040 

CTGCCACCAA TCAGTAAGTT AGTCTATAAC AAATTTTACC CTAACAGTTT TACCACCTAG 5100 

CAACAGTCAT TTCTGAAAAT ATGTTGGATA GAAAGTCACT CTTTGGCAAA AGTGTTAGAA 5160 

TTTGCTTTTG TGCCATCTAT TCCTTTTATG GCATCTATCT TGAAAGTAAT CTTGTATTGG 5220 

AGATTGAAAG ATGCTGTAAT TTAGAAATTA ACATGATATC TTAAATTACC TTTATGAAAT 5280 

ATAGTTTTGT ATAATAGCAT AGATTTTCCT TCAAAAAATG AACATTTATA TATCTACAAA 5340 

AATATGGAGA AGAGCAATTT GAAAGCCTAC TTTCTGAAGA AAATGGTGGG ATTTTTTTTT 5400 

ATCATGATTA AATATCAAAA AATTGCCCTA TGAAAACTTT AAATCTCTAA AACATTTGAA 5460 

ATACTACCAT ATTTGTGATT TATTGAGAAT AAAAATCCAT TTTGAAATGT AAAATTTTTA 5520 

TGATCTGATT CAGTTTTAAG AAAACATGAA TGAACTAGAA GATATTAAAA ACATTTGACA 5580 

TTGGTAAGAA ATATTGATAC TGATATTGAT TTTTATATAG GTATTTATTT CAGAATTGAT 5640 

ATTTTGAGAA AAATACATGT GAGTCATTTT TTCTGTTTCT CTTTTCTCTT AACGATTATC 5700 

ACTGTAATTC TGAATCT 

Seq ID NO: 189 Protein sequence:
Protein Accession #: NP_002201.1 

1 11 21 31 41 51 

I I I I I I 

MAFPPRRRLR LGPRGLPLLL SGLLLPLCRA FNLDVDSPAE YSGPEGSYFG FAVDFFVPSA 60 

SSRMFLLVGA PKANTTQPGI VBGGQVLKCD WSSTRRCQPI BFDATGNRDY AKDDPLEFKS 120 

HQKFGASVRS KQDKILACAP LYHWRTEMKQ EREPVGTCFL QDGTKTVEYA PCRSQDIDAD 180 

GQGFCQGGFS IDFTKADRVL LGGPGSFYWQ GQLISDQVAE IVSKYDPNVY SIKYNNQLAT 240 

RTAQAIPDDS YLGYSVAVGD FNGDGIDDFV SGVPRAARTL GMVYIYDGKN MSSLYNFTGE 300 






QMAAYFGFSV AATDINGDDY ADVFIGAPLP MDRGSDGKLQ EVGQVSVSLQ RASGDFQTTK 360
LNGFEVFARF GSAIAPLGDL DQDGFNDIAI AAPYGGEDKK GIVYIPNGRS TGLNAVPSQI 420
LEGQWAARSM PPSFGYSMKG ATDXDKNGYP DLIVGAFGVD RAILYRARPV ITVNAGLEVY 480
PSILNQDNKT CSLPGTALKV SCFNVRFCLK ADGKGVLPRK LNFQVELLLD KLKQKGAIRR 540
ALFLYSRSPS HSKNMTISRG GLMQCEELIA YLRDESEFRD KLTPITIFME YRLDYRTAAD 600
TTGLQPILNQ FTPANISRQA HILLDCGEDN VCKPKLEVSV DSDQKKIYIG DDNPLTLIVK 660
AQNQGEGAYE AELIVSIPLQ ADFIGVVRNN EALARLSCAF KTENQTRQVV CDLGNPMKAG 720
TQLLAGLRFS VHQQSEMDTS VKFDLQIQSS NLFDKVSPVV SHKVDLAVLA AVEIRGVSSP 780
DHIFLPIPNW EHKENPETEE DVGPVVQHIY ELRNNGPSSF SKAMLHLQWP YKYNNNTLLY 840
ILHYDIDGPM NCTSDMEINP LRIKISSLQT TEKNDTVAGQ GERDHLITKR DLALSEGDIH 900
TLGCGVAQCL KIVCQVGRLD RGKSAILYVK SLLWTETFMN KENQNHSYSL KSSASFNVIE 960
FPYKNLPIED ITNSTLVTTN VTWGIQPAPM PVPVWVIILA VLAGLLLLAV LVFVMYRMGF 1020
FKRVRPPQEE QEREQLQPHE NGEGNSET
1020 



Seq ID NO: 190 DNA sequence 
Nucleic Acid Accession #: NM_004864 

Coding sequence: 26-952 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

CGGAACGAGG GCAACCTGCA CAGCCATGCC CGGGCAAGAA CTCAGGACGG TGAATGGCTC 60
TCAGATGCTC CTGGTGTTGC TGGTGCTCTC GTGGCTGCCG CATGGGGGCG CCCTGTCTCT 120
GGCCGAGGCG AGCCGCGCAA GTTTCCCGGG ACCCTCAGAG TTGCACTCCG AAGACTCCAG 180
ATTCCGAGAG TTGCGGAAAC GCTACGAGGA CCTGCTAACC AGGCTGCGGG CCAACCAGAG 240
CTGGGAAGAT TCGAACACCG ACCTCGTCCC GGCCCCTGCA GTCCGGATAC TCACGCCAGA 300
AGTGCGGCTG GGATCCGGCG GCCACCTGCA CCTGCGTATC TCTCGGGCCG CCCTTCCCGA 360
GGGGCTCCCC GAGGCCTCCC GCCTTCACCG GGCTCTGTTC CGGCTGTCCC CGACGGCGTC 420
AAGGTCGTGG GACGTGACAC GACCGCTGCG GCGTCAGCTC AGCCTTGCAA GACCCCAAGC 480
GCCCGCGCTG CACCTGCGAC TGTCGCCGCC GCCGTCGCAG TCGGACCAAC TGCTGGCAGA 540
ATCTTCGTCC GCACGGCCCC AGCTGGAGTT GCACTTGCGG CCGCAAGCCG CCAGGGGGCG 600
CCGCAGAGCG CGTGCGCGCA ACGGGGACGA CTGTCCGCTC GGGCCCGGGC GTTGCTGCCG 660
TCTGCACACG GTCCGCGCGT CGCTGGAAGA CCTGGGCTGG GCCGATTGGG TGCTGTCGCC 720
ACGGGAGGTG CAAGTGACCA TGTGCATCGG CGCGTGCCCG AGCCAGTTCC GGGCGGCAAA 780
CATGCACGCG CAGATCAAGA CGAGCCTGCA CGGCCTGAAG CCCGACACGG AGCCAGCGCC 840
CTGCTGCGTG CCCGCCAGCT ACAATCCCAT GGTGCTCATT CAAAAGACCG ACACCGGGGT 900
GTCGCTCCAG ACCTATGATG ACTTGTTAGC CAAAGACTGC CACTGCATAT GAGCAGTCCT 960
GGTCCTTCCA CTGTGCACCT GCGCGGGGGA GGCGACCTCA GTTGTCCTGC CCTGTGGAAT 1020
GGGCTCAAGG TTCCTGAGAC ACCCGATTCC TGCCCAAACA GCTGTATTTA TATAAGTCTG 1080
TTATTTATTA TTAATTTATT GGGGTGACCT TCTTGGGGAC TCGGGGGCTG GTCTGATGGA 1140
ACTGTGTATT TATTTAAAAC TCTGGTGATA AAAATAAAGC TGTCTGAACT GTTAAAAAAA 1200
AAAA



Seq ID NO: 191 Protein sequence:
Protein Accession #: NP_004855

1 11 21 31 41 51

I I I I I I

MPGQELRTVN GSQMLLVLLV LSWLPHGGAL SLAEASRASF PGPSELHSED SRFRELRKRY 60
EDLLTRLRAN QSWEDSNTDL VPAPAVRILT PEVRLGSGGH LHLRISRAAL PEGLPEASRL 120
HRALFRLSPT ASRSWDVTRP LRRQLSLARP QAPALHLRLS PPPSQSDQLL AESSSARPQL 180
ELHLRPQAAR GRRRARARNG DDCPLGPGRC CRLHTVRASL EDLGWADWVL SPREVQVTMC 240
IGACPSQFRA ANMHAQIKTS LHRLKPDTEP APCCVPASYN PMVLIQKTDT GVSLQTYDDL 300
LAKDCHCI
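
As an illustration of how the "Coding sequence" coordinates given in this listing relate to the accompanying protein sequences, the following Python sketch may be useful. It assumes that the coordinates are 1-based and inclusive and that the range ends on the last base of the stop codon; the function names are hypothetical and are not part of the original disclosure.

```python
def extract_cds(sequence: str, start: int, end: int) -> str:
    """Return the coding region for 1-based, inclusive coordinates (assumed convention)."""
    # Remove the spacing between the 10-base groups used in the listing.
    cleaned = "".join(sequence.split()).upper()
    if not (1 <= start <= end <= len(cleaned)):
        raise ValueError("coordinates fall outside the sequence")
    return cleaned[start - 1:end]


def expected_residues(start: int, end: int) -> int:
    """Number of amino acids encoded, assuming the range includes the stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    return length // 3 - 1


if __name__ == "__main__":
    # Seq ID NO: 190 lists "Coding sequence: 26-952".
    print(expected_residues(26, 952))
```

Under these assumptions the range 26-952 spans 927 bases, or 309 codons, which corresponds to 308 residues followed by the stop codon.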



Seq ID NO: 192 DNA sequence 

Nucleic Acid Accession #: XM_061731.1 

Coding sequence: 1-567 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

ATGAGAAAAG GAAATGAGGG AGAGAACACA GAAGAGGGCA GGCTTGCTCA GCTTGCTCAA 60
AGAAAGTTTC TCAAAGAAGA TGGCATTACA TTGCACATCT CTCTGTGTCT CTCTATTGCT 120
GTAAAAGAAC CTTTCTCTCT GATTGGACTT GACACACAGA AGGATCTCAG TAAAGATTTG 180
CTGTTGTTGA TGTCCACAGA CACTGGCAAG GACAGGTTTA CCAACATACT GCTGTCACAC 240
TCCCCTCCAA TGTGCACCAA ATCACGTAAA AATGGGGATA ATGACTCCCC TGCCTTCACA 300
TGGGGTGGCA AAGACACCAG GAGCAATACT GATCTTCCTA TCAGAGACCC TGGGGGCAAG 360
AGTCTTTCAC TCACCAAACA TTCCCACAAG CCTGTCCCTG AGCATCAGTG TGACCAGAGA 420
GAGGTCTTCC AGCCACTTTC AGAGCCAGGT GTAGAAGCAG AGATGGAAGT GTTCGCTGAT 480
GCTGGATGGT GGATTTATCA GAGCTGTCAG GTTCCTTCCT CAACCCTTGC AAGAAAGAAG 540
ATGGTTTATT CTAAAGAAAC TGAGTGA



Seq ID NO: 193 Protein sequence:
Protein Accession #: XP_061731.1

1 11 21 31 41 51

I I I I I I

MRKGNEGENT EEGRLAQLAQ RKFLKEDGIT LHISLCLSIA VKEPFSLIGL DTQKDLSKDL 60

LLLMSTDTGK DRFTNILLSH SPPMCTKSRK NGDNDSPAFT WGGKDTRSNT DLPIRDPGGK 120

SLSLTKHSHK PVPEHQCDQR EVFQPLSEPG VEAEMEVFAD AGWWIYQSCQ VPSSTLARKK 180
MVYSKETE



Seq ID NO: 194 DNA sequence 

Nucleic Acid Accession #: NM_005415.2 

Coding sequence: 371-2410 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GAGCTGTCCC CGGTGCCGCC GACCCGGGCC GTGCCGTGTG CCCGTGGCTC CAGCCGCTGC 60
CGCCTCGATC TCCTCGTCTC CCGCTCCGCC CTCCCTTTTC CCTGGATGAA CTTGCGTCCT 120
TTCTCTTCTC CGCCATGGAA TTCTGCTCCG TGCTTTTAGC CCTCCTGAGC CAAAGAAACC 180
CCAGACAACA GATGCCCATA CGCAGCGTAT AGCAGTAACT CCCCAGCTCG GTTTCTGTGC 240
CGTAGTTTAC AGTATTTAAT TTTATATAAT ATATATTATT TATTATAGCA TTTTTGATAC 300
CTCATATTCT GTTTACACAT CTTGAAAGGC GCTCAGTAGT TCTCTTACTA AACAACCACT 360
ACTCCAGAGA ATGGCAACGC TGATTACCAG TACTACAGCT GCTACCGCCG CTTCTGGTCC 420
TTTGGTGGAC TACCTATGGA TGCTCATCCT GGGCTTCATT ATTGCATTTG TCTTGGCATT 480
CTCCGTGGGA GCCAATGATG TAGCAAATTC TTTTGGTACA GCTGTGGGCT CAGGTGTAGT 540
GACCCTGAAG CAAGCCTGCA TCCTAGCTAG CATCTTTGAA ACAGTGGGCT CTGTCTTACT 600
GGGGGCCAAA GTGAGCGAAA CCATCCGGAA GGGCTTGATT GACGTGGAGA TGTACAACTC 660
GACTCAAGGG CTACTGATGG CCGGCTCAGT CAGTGCTATG TTTGGTTCTG CTGTGTGGCA 720
ACTCGTGGCT TCGTTTTTGA AGCTCCCTAT TTCTGGAACC CATTGTATTG TTGGTGCAAC 780
TATTGGTTTC TCCCTCGTGG CAAAGGGGCA GGAGGGTGTC AAGTGGTCTG AACTGATAAA 840
AATTGTGATG TCTTGGTTCG TGTCCCCACT GCTTTCTGGA ATTATGTCTG GAATTTTATT 900
CTTCCTGGTT CGTGCATTCA TCCTCCATAA GGCAGATCCA GTTCCTAATG GTTTGCGAGC 960
TTTGCCAGTT TTCTATGCCT GCACAGTTGG AATAAACCTC TTTTCCATCA TGTATACTGG 1020
AGCACCGTTG CTGGGCTTTG ACAAACTTCC TCTGTGGGGT ACCATCCTCA TCTCGGTGGG 1080
ATGTGCAGTT TTCTGTGCCC TTATCGTCTG GTTCTTTGTA TGTCCCAGGA TGAAGAGAAA 1140
AATTGAACGA GAAATAAAGT GTAGTCCTTC TGAAAGCCCC TTAATGGAAA AAAAGAATAG 1200
CTTGAAAGAA GACCATGAAG AAACAAAGTT GTCTGTTGGT GATATTGAAA ACAAGCATCC 1260
TGTTTCTGAG GTAGGGCCTG CCACTGTGCC CCTCCAGGCT GTGGTGGAGG AGAGAACAGT 1320
CTCATTCAAA CTTGGAGATT TGGAGGAAGC TCCAGAGAGA GAGAGGCTTC CCAGCGTGGA 1380
CTTGAAAGAG GAAACCAGCA TAGATAGCAC CGTGAATGGT GCAGTGCAGT TGCCTAATGG 1440
GAACCTTGTC CAGTTCAGTC AAGCCGTCAG CAACCAAATA AACTCCAGTG GCCACTCCCA 1500
GTATCACACC GTGCATAAGG ATTCCGGCCT GTACAAAGAG CTACTCCATA AATTACATCT 1560
TGCCAAGGTG GGAGATTGCA TGGGAGACTC CGGTGACAAA CCCTTAAGGC GCAATAATAG 1620
CTATACTTCC TATACCATGG CAATATGTGG CATGCCTCTG GATTCATTCC GTGCCAAAGA 1680
AGGTGAACAG AAGGGCGAAG AAATGGAGAA GCTGACATGG CCTAATGCAG ACTCCAAGAA 1740
GCGAATTCGA ATGGACAGTT ACACCAGTTA CTGCAATGCT GTGTCTGACC TTCACTCAGC 1800
ATCTGAGATA GACATGAGTG TCAAGGCAGC GATGGGTCTA GGTGACAGAA AAGGAAGTAA 1860
TGGCTCTCTA GAAGAATGGT ATGACCAGGA TAAGCCTGAA GTCTCTCTCC TCTTCCAGTT 1920
CCTGCAGATC CTTACAGCCT GCTTTGGGTC ATTCGCCCAT GGTGGCAATG ACGTAAGCAA 1980
TGCCATTGGG CCTCTGGTTG CTTTATATTT GGTTTATGAC ACAGGAGATG TTTCTTCAAA 2040
AGTGGCAACA CCAATATGGC TTCTACTCTA TGGTGGTGTT GGTATCTGTG TTGGTCTGTG 2100
GGTTTGGGGA AGAAGAGTTA TCCAGACCAT GGGGAAGGAT CTGACACCGA TCACACCCTC 2160
TAGTGGCTTC AGTATTGAAC TGGCATCTGC CCTCACTGTG GTGATTGCAT CAAATATTGG 2220
CCTTCCCATC AGTACAACAC ATTGTAAAGT GGGCTCTGTT GTGTCTGTTG GCTGGCTCCG 2280
GTCCAAGAAG GCTGTTGACT GGCGTCTCTT TCGTAACATT TTTATGGCCT GGTTTGTCAC 2340
AGTCCCCATT TCTGGAGTTA TCAGTGCTGC CATCATGGCA ATCTTCAGAT ATGTCATCCT 2400
CAGAATGTGA AGCTGTTTGA GATTAAAATT TGTGTCAATG TTTGGGACCA TCTTAGGTAT 2460
TCCTGCTCCC CTGAAGAATG ATTACAGTGT TAACAGAAGA CTGACAAGAG TCTTTTTATT 2520
TGGGAGCAGA GGAGGGAAGT GTTACTTGTG CTATAACTGC TTTTGTGCTA AATATGAATT 2580
GTCTCAAAAT TAGCTGTGTA AAATAGCCCG GGTTCCACTG GCTCCTGCTG AGGTCCCCTT 2640
TCCTTCTGGG CTGTGAATTC CTGTACATAT TTCTCTACTT TTTGTATCAG GCTTCAATTC 2700
CATTATGTTT TAATG1TGTC TCTGAAGATG ACTTGTGATT TTTTTTTCTT TTTTTTAAAC 2760
CATGAAGAGC CGTTTGACAG AGCATGCTCT GCGTTGTTGG TTTCACCAGC TTCTGCCCTC 2820
ACATGCACAG GGATTTAACA ACAAAAATAT AACTACAACT TCCCTTGTAG TCTCTTATAT 2880
AAGTAGAGTC CTTGGTACTC TGCCCTCCTG TCAGTAGTGG CAGGATCTAT TGGCATATTC 2940
GGGAGCTTCT TAGAGGGATG AGGTTCTTTG AACACAGTGA AAATTTAAAT TAGTAACTTT 3000
TTTGCAAGCA GTTTATTGAC TGTTATTGCT AAGAAGAAGT AAGAAAGAAA AAGCCTGTTG 3060
GCAATCTTGG TTATTTCTTT AAGATTTCTG GCAGTGTGGG ATGGATGAAT GAAGTGGAAT 3120
GTGAACTTTG GGCAAGTTAA ATGGGACAGC CTTCCATGTT CATTTGTCTA CCTCTTAACT 3180
GAATAAAAAA GCCTACAGTT TTTAGAAAAA ACCCGAATTC



Seq ID NO: 195 Protein sequence:
Protein Accession #: NP_005406.2 

1 11 21 31 41 51 

I I I I I I 

MATLITSTTA ATAASGPLVD YLWMLILGFI IAFVLAFSVG ANDVANSFGT AVGSGVVTLK 60
QACILASIFE TVGSVLLGAK VSETIRKGLI DVEMYNSTQG LLMAGSVSAM FGSAVWQLVA 120
SFLKLPISGT HCIVGATIGF SLVAKGQEGV KWSELIKIVM SWFVSPLLSG IMSGILFFLV 180
RAFILHKADP VPNGLRALPV FYACTVGINL FSIMYTGAPL LGFDKLPLWG TILISVGCAV 240
FCALIVWFFV CPRMKRKIER EIKCSPSESP LWEKKNSLKE DHEETKLSVG DIENKHPVSE 300
VGPATVPLQA VVEERTVSFK LGDLEEAPER ERLPSVDLKE ETSIDSTVNG AVQLPNGNLV 360
QFSQAVSNQI NSSGHSQYHT VHKDSGLYKE LLHKLHLAKV GDCMGDSGDK PLRRNNSYTS 420
YTMAICGMPL DSFRAKEGEQ KGEEMEKLTW PNADSKKRIR MDSYTSYCNA VSDLHSASEI 480
DMSVKAAMGL GDRKGSNGSL EEWYDQDKPE VSLLFQFLQI LTACFGSFAH GGNDVSNAIG 540
PLVALYLVYD TGDVSSKVAT PIWLLLYGGV GICVGLWVWG RRVIQTMGKD LTPITPSSGF 600
SIELASALTV VIASNIGLPI STTHCKVGSV VSVGWLRSKK AVDWRLFRNI FMAWFVTVPI 660
SGVISAAIMA IFRYVILRM



Seq ID NO: 196 DNA sequence 

Nucleic Acid Accession #: NM_000020.1 

Coding sequence: 283-1794 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

AGGAAACGGT TTATTAGGAG GGAGTGGTGG AGCTGGGCCA GGCAGGAAGA CGCTGGAATA 60
AGAAACATTT TTGCTCCAGC CCCCATCCCA GTCCCGGGAG GCTGCCGCGC CAGCTGCGCC 120
GAGCGAGCCC CTCCCCGGCT CCAGCCCGGT CCGGGGCCGC GCCGGACCCC AGCCCGCCGT 180
CCAGCGCTGG CGGTGCAACT GGGGCCGCGC GGTGGAGGGG AGGTGGCCCC GGTCCGCCGA 240
AGGCTAGCGC CCCGCCACCC GCAGAGCGGG CCCAGAGGGA CCATGACCTT GGGCTCCCCC 300
AGGAAAGGCC TTCTGATGCT GCTGATGGCC TTGGTGACCC AGGGAGACCC TGTGAAGCCG 360
TCTCGGGGCC CGCTGGTGAC CTGCACGTGT GAGAGCCCAC ATTGCAAGGG GCCTACCTGC 420
CGGGGGGCCT GGTGCACAGT AGTGCTGGTG CGGGAGGAGG GGAGGCACCC CCAGGAACAT 480
CGGGGCTGCG GGAACTTGCA CAGGGAGCTC TGCAGGGGGC GCCCCACCGA GTTCGTCAAC 540
CACTACTGCT GCGACAGCCA CCTCTGCAAC CACAACGTGT CCCTGGTGCT GGAGGCCACC 600
CAACCTCCTT CGGAGCAGCC GGGAACAGAT GGCCAGCTGG CCCTGATCCT GGGCCCCGTG 660
CTGGCCTTGC TGGCCCTGGT GGCCCTGGGT GTCCTGGGCC TGTGGCATGT CCGACGGAGG 720
CAGGAGAAGC AGCGTGGCCT GCACAGCGAG CTGGGAGAGT CCAGTCTCAT CCTGAAAGCA 780
TCTGAGCAGG GCGACACGAT GTTGGGGGAC CTCCTGGACA GTGACTGCAC CACAGGGAGT 840
GGCTCAGGGC TCCCCTTCCT GGTGCAGAGG ACAGTGGCAC GGCAGGTTGC CTTGGTGGAG 900
TGTGTGGGAA AAGGCCGCTA TGGCGAAGTG TGGCGGGGCT TGTGGCACGG TGAGAGTGTG 960
GCCGTCAAGA TCTTCTCCTC GAGGGATGAA CAGTCCTGGT TCCGGGAGAC TGAGATCTAT 1020
AACACAGTAT TGCTCAGACA CGACAACATC CTAGGCTTCA TCGCCTCAGA CATGACCTCC 1080
CGCAACTCGA GCACGCAGCT GTGGCTCATC ACGCACTACC ACGAGCACGG CTCCCTCTAC 1140
GACTTTCTGC AGAGACAGAC GCTGGAGCCC CATCTGGCTC TGAGGCTAGC TGTGTCCGCG 1200
GCATGCGGCC TGGCGCACCT GCACGTGGAG ATCTTCGGTA CACAGGGCAA ACCAGCCATT 1260
GCCCACCGCG ACTTCAAGAG CCGCAATGTG CTGGTCAAGA GCAACCTGCA GTGTTGCATC 1320
GCCGACCTGG [missing] GATGCACTCA CAGGGCAGCG ATTACCTGGA CATCGGCAAC 1380
AACCCGAGAG TGGGCACCAA GCGGTACATG GCACCCGAGG TGCTGGACGA GCAGATCCGC 1440
ACGGACTGCT TTGAGTCCTA CAAGTGGACT GACATCTGGG CCTTTGGCCT GGTGCTGTGG 1500
GAGATTGCCC GCCGGACCAT CGTGAATGGC ATCGTGGAGG ACTATAGACC ACCCTTCTAT 1560
GATGTGGTGC CCAATGACCC CAGCTTTGAG GACATGAAGA AGGTGGTGTG TGTGGATCAG 1620
CAGACCCCCA CCATCCCTAA CCGGCTGGCT GCAGACCCGG TCCTCTCAGG CCTAGCTCAG 1680
ATGATGCGGG AGTGCTGGTA CCCAAACCCC TCTGCCCGAC TCACCGCGCT GCGGATCAAG 1740
AAGACACTAC AAAAAATTAG CAACAGTCCA GAGAAGCCTA AAGTGATTCA ATAGCCCAGG 1800
AGCACCTGAT TCCTTTCTGC CTGCAGGGGG CTGGGGGGGT GGGGGGCAGT GGATGGTGCC 1860
CTATCTGGGT AGAGGTAGTG TGAGTGTGGT GTGTGCTGGG GATGGGCAGC TGCGCCTGCC 1920
TGCTCGGCCC CCAGCCCACC CAGCCAAAAA TACAGCTGGG CTGAAACCTG
1920 



Seq ID NO: 197 Protein sequence:
Protein Accession #: NP_000011.1

1 11 21 31 41 51

I I I I I I

MTLGSPRKGL LMLLMALVTQ GDPVKPSRGP LVTCTCESPH CKGPTCRGAW CTVVLVREEG 60
RHPQEHRGCG NLHRELCRGR PTEFVNHYCC DSHLCNHNVS LVLEATQPPS EQPGTDGQLA 120
LILGPVLALL ALVALGVLGL WHVRRRQEKQ RGLHSELGES SLILKASEQG DTMLGDLLDS 180
DCTTGSGSGL PFLVQRTVAR QVALVECVGK GRYGEVWRGL WHGESVAVKI FSSRDEQSWF 240
RETEIYNTVL LRHDNILGFI ASDMTSRNSS TQLWLITHYH EHGSLYDFLQ RQTLEPHLAL 300
RLAVSAACGL AHLHVEIFGT QGKPAIAHRD FKSRNVLVKS NLQCCIADLG LAVMHSQGSD 360
YLDIGNNPRV GTKRYMAPEV LDEQIRTDCF ESYKWTDIWA FGLVLWEIAR RTIVNGIVED 420
YRPPFYDVVP NDPSFEDMKK VVCVDQQTPT IPNRLAADPV LSGLAQMMRE CWYPNPSARL 480
TALRIKKTLQ KISNSPEKPK VIQ



Seq ID NO: 198 DNA sequence 

Nucleic Acid Accession #: NM_003199.1 

Coding sequence: 200-2203 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

CGGGGGGATC TTGGCTGTGT GTCTGCGGAT CTGTAGTGGC GGCGGCGGCG GCGGCGGCGG 60 

GGAGGCAGCA GGCGCGGGAG CGGGCGCAGG AGCAGGCGGC GGCGGTGGCG GCGGCGGTTA 120 

GACATGAACG CCGCCTCGGC GCCGGCGGTG CACGGAGAGC CCCTTCTCGC GCGCGGGCGG 180 

TTTGTGTGAT TTTGCTAAAA TGCATCACCA ACAGCGAATG GCTGCCTTAG GGACGGACAA 240 

AGAGCTGAGT GATTTACTGG ATTTCAGTGC GATGTTTTCA CCTCCTGTGA GCAGTGGGAA 300 

AAATGGACCA ACTTCTTTGG CAAGTGGACA TTTTACTGGC TCAAATGTAG AAGACAGAAG 360 

TAGCTCAGGG TCCTGGGGGA ATGGAGGACA TCCAAGCCCG TCCAGGAACT ATGGAGATGG 420 
GACTCCCTAT GACCACATGA CCAGCAGGGA CCTTGGGTCA CATGACAATC TCTCTCCACC 480 

TTTTGTCAAT TCCAGAATAC AAAGTAAAAC AGAAAGGGGC TCATACTCAT CTTATGGGAG 540 

AGAATCAAAC TTACAGGGTT GCCACCAGCA GAGTCTCCTT GGAGGTGACA TGGATATGGG 600 

CAACCCAGGA ACCCTTTCGC CCACCAAACC TGGTTCCCAG TACTATCAGT ATTCTAGCAA 660 

TAATCCCCGA AGGAGGCCTC TTCACAGTAG TGCCATGGAG GTACAGACAA AGAAAGTTCG 720 

AAAAGTTCCT CCAGGTTTGC CATCTTCAGT CTATGCTCCA TCAGCAAGCA CTGCCGACTA 780 

CAATAGGGAC TCGCCAGGCT ATCCTTCCTC CAAACCAGCA ACCAGCACTT TCCCTAGCTC 840 

CTTCTTCATG CAAGATGGCC ATCACAGCAG TGACCCTTGG AGCTCCTCCA GTGGGATGAA 900 

TCAGCCTGGC TATGCAGGAA TGTTGGGCAA CTCTTCTCAT ATTCCACAGT CCAGCAGCTA 960 

CTGTAGCCTG CATCCACATG AACGTTTGAG CTATCCATCA CACTCCTCAG CAGACATCAA 1020 

TTCCAGTCTT CCTCCGATGT CCACTTTCCA TCGTAGTGGT ACAAACCATT ACAGCACCTC 1080 

TTCCTGTACG CCTCCTGCCA ACGGGACAGA CAGTATAATG GCAAATAGAG GAAGCGGGGC 1140 

AGCCGGCAGC TCCCAGACTG GAGATGCTCT GGGGAAAGCA CTTGCTTCGA TCTATTCTCC 1200 

AGATCACACT AACAACAGCT TTTCATCAAA CCCTTCAACT CCTGTTGGCT CTCCTCCATC 1260 

TCTCTCAGCA GGCACAGCTG TTTGGTCTAG AAATGGAGGA CAGGCCTCAT CGTCTCCTAA 1320 

TTATGAAGGA CCCTTACACT CTTTGCAAAG CCGAATTGAA GATCGTTTAG AAAGACTGGA 1380 

TGATGCTATT CATGTTCTCC GGAACCATGC AGTGGGCCCA TCCACAGCTA TGCCTGGTGG 1440 

TCATGGGGAC ATGCATGGAA TCATTGGACC TTCTCATAAT GGAGCCATGG GTGGTCTGGG 1500 

CTCAGGGTAT GGAACCGGCC TTCTTTCAGC CAACAGACAT TCACTCATGG TGGGGACCCA 1560 

TCGTGAAGAT GGCGTGGCCC TGAGAGGCAG CCATTCTCTT CTGCCAAACC AGGTTCCGGT 1620 

TCCACAGCTT CCTGTCCAGT CTGCGACTTC CCCTGACCTG AACCCACCCC AGGACCCTTA 1680 

CAGAGGCATG CCACCAGGAC TACAGGGGCA GAGTGTCTCC TCTGGCAGCT CTGAGATCAA 1740 

ATCCGATGAC GAGGGTGATG AGAACCTGCA AGACACGAAA TCTTCGGAGG ACAAGAAATT 1800 

AGATGACGAC AAGAAGGATA TCAAATCAAT TACTAGCAAT AATGACGATG AGGACCTGAC 1860 

ACCAGAGCAG AAGGCAGAGC GTGAGAAGGA GCGGAGGATG GCCAACAATG CCCGAGAGCG 1920 

TCTGCGGGTC CGTGACATCA ACGAGGCTTT CAAAGAGCTC GGCCGCATGG TGCAGCTCCA 1980 

CCTCAAGAGT GACAAGCCCC AGACCAAGCT CCTGATCCTC CACCAGGCGG TGGCCGTCAT 2040 

CCTCAGTCTG GAGCAGCAAG TCCGAGAAAG GAATCTGAAT CCGAAAGCTG CGTGTCTGAA 2100 

AAGAAGGGAG GAAGAGAAGG TGTCCTCGGA GCCTCCCCCT CTCTCCTTGG CCGGCCCACA 2160 

CCCTGGAATG GGAGACGCAT CGAATCACAT GGGACAGATG TAAAAGGGTC CAAGTTGCCA 2220 

CATTGCTTCA TTAAAACAAG AGACCACTTC CTTAACAGCT GTATTATCTT AAACCCACAT 2280 

AAACACTTCT CCTTAACCCC CATTTTTGTA ATATAAGACA AGTCTGAGTA GTTATGAATC 2340 

GCAGACGCAA GAGGTTTCAG CATTCCCAAT TATCAAAAAA CAGAAAAACA AAAAAAAGAA 2400 

AGAAAAAAGT GCAACTTGAG GGACGACTTT CTTTAACATA TCATTCAGAA TGTGCAAAGC 2460 
AGTATGTACA GGCTGAGACA CAGCCCAGAG ACTGAACGGC 



Seq ID NO: 199 Protein sequence:
Protein Accession #: NP_003190.1

1 11 21 31 41 51 

I I I I I I

MHHQQRMAAL GTDKELSDLL DFSAMFSPPV SSGKNGPTSL ASGHFTGSNV EDRSSSGSWG 60 

NGGHPSPSRN YGDGTPYDHM TSRDLGSHDN LSPPFVNSRI QSKTERGSYS SYGRESNLQG 120 

CHQQSLLGGD MDMGNPGTLS PTKPGSQYYQ YSSNNPRRRP LHSSAMEVQT KKVRKVPPGL 180 

PSSVYAPSAS TADYNRDSPG YPSSKPATST FPSSFFMQDG HHSSDPWSSS SGMNQPGYAG 240 

MLGNSSHIPQ SSSYCSLHPH ERLSYPSHSS ADINSSLPPM STFHRSGTNH YSTSSCTPPA 300 

NGTDSIMANR GSGAAGSSQT GDALGKALAS IYSPDHTNNS FSSNPSTPVG SPPSLSAGTA 360 

VWSRNGGQAS SSPNYEGPLH SLQSRIEDRL ERLDDAIHVL RNHAVGPSTA MPGGHGDMHG 420 

IIGPSHNGAM GGLGSGYGTG LLSANRHSLM VGTHREDGVA LRGSHSLLPN QVPVPQLPVQ 480 

SATSPDLNPP QDPYRGMPPG LQGQSVSSGS SEIKSDDEGD ENLQDTKSSE DKKLDDDKKD 540 

IKSITSNNDD EDLTPEQKAE REKERRMANN ARERLRVRDI NEAFKELGRM VQLHLKSDKP 600 

QTKLLILHQA VAVILSLEQQ VRERNLNPKA ACLKRREEEK VSSEPPPLSL AGPHPGMGDA 660 
SNHMGQM 



Seq ID NO: 200 DNA sequence 

Nucleic Acid Accession #: BC005987 (1-1286), BE888744 (1287-1756) 

Coding sequence: 124-525 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GGCAGAAGAG GAAGATTTCT GAAGAGTGCA GCTGCCTGAA CCGAGCCCTG CCGAACAGCT 60 

GAGAATTGCA CTGCAACCAT GAGTGAGAAC AATAAGAATT CCTTGGAGAG CAGCCTACGG 120 

CAACTAAAAT GCCATTTCAC CTGGAACTTG ATGGAGGGAG AAAACTCCTT GGATGATTTT 180 

GAAGACAAAG TATTTTACCG GACTGAGTTT CAGAATCGTG AATTCAAAGC CACAATGTGC 240 

AACCTACTGG CCTATCTAAA GCACCTCAAA GGGCAAAACG AGGCAGCCCT GGAATGCTTA 300 

CGTAAAGCTG AAGAGTTAAT CCAGCAAGAG CATGCTGACC AGGCAGAAAT CAGAAGTCTG 360 

GTCACCTGGG GAAACTATGC CTGGGTCTAC TATCACATGG GCCGACTCTC AGACGTTCAG 420 

ATTTATGTAG ACAAGGTGAA ACATGTCTGT GAGAAGTTTT CCAGTCCCTA TAGAATTGAG 480 

AGTCCAGAGC TTGACTGTGA GGAAGGGTGG ACACGGTTAA AGTGTGGARG AAACCAAAAT 540 

GAAAGAGCGA AGGTGTGCTT TGAGAAGGCT CTGGAAAAGA AGCCAAAGAA CCCAGAATTC 600 

ACCTCTGGAC TGGCAATAGC AAGCTACCGT CTGGACAACT GGCCACCATC TCAGAACGCC 660 

ATTGACCCTC TGAGGCAAGC CATTCGGCTG AATCCTGACA ACCAGTACCT TAAAGTCCTC 720 

CTGGCTCTGA AGCTTCATAA GATGCGTGAA GAAGGTGAAG AGGAAGGTGA AGGAGAGAAG 780 

TTAGTTGAAG AAGCCTTGGA GAAAGCCCCA GGTGTAACAG ATGTACTTCG CAGTGCAGCC 840 

AAGTTTTATC GAAGAAAAGA TGAGCCAGAC AAAGCGATTG AACTGCTTAA AAAGGCTTTA 900 

GAATACATAC CAAACAATGC CTACCTGCAT TGCCAAATTG GGTGCTGCTA TAGGGCAAAA 960 
GTCTTCCAAG TAATGAATCT AAGAGAGAAT GGAATGTATG GGAAAAGAAA GTTACTGGAA 1020
CTAATAGGAC ACGCTGTGGC TCATCTGAAG AAAGCTGATG AGGCCAATGA TAATCTCTTC 1080
CGTGTCTGTT CCATTCTTGC CAGCCTCCAT GCTCTAGCAG ATCAGTATGA AGAAGCAGAG 1140
TATTACTTCC AAAAGGAATT CAGTAAAGAG CTTACTCCTG TAGCGAAACA ACTGCTCCAT 1200
CTGCGGTATG GCAACTTTCA GCTGTACCAA ATGAAGTGTG AAGACAAGGC CATCCACCAC 1260
TTTATAGAGG GTGTAAAAAT AAACCAGAAA TCAAGGGAGA AAGAAAAGAT GAAAGACAAA 1320
CTGCAAAAAA TTGCCAAAAT GCGACTTTCT AAAAATGGAG CAGATTCTGA GGCTTTGCAT 1380
GTCTTGGCAT TCCTTCAGGA GCTGAATGAA AAAATGCAAC AAGCAGATGA AGACTCTGAG 1440
AGGGGTTTGG AGTCTGGAAG CCTCATCCCT TCAGCATCAA GCTGGAATGG GGAATGAAGA 1500
ATAGAGATGT GGTGCCCACT AGGCTACTGC TGAAAGGGAG CTGAAATTCC TCCACAAGTT 1560
GGTATTCAAA ATATGTAATG ACTGGTATGG CAAAAGATTG GACTAAGACA CTGGCCATAC 1620
CACTGGACAG GGTTATGTTA AACCTGAATT GCTGGGTCTr AAAAGAGCCC AAGGAGTTCT 1680
GGGAGAGGGA CAGATTGGGG GGTCGTCCAG GGCTGCGCTA AATTATTCTC AATGATTTGT 1740
CTCTTTGCGG AACTTC



Seq ID NO: 201 Protein sequence:
Protein Accession #: AAA59191

1 11 21 31 41 51

I I I I I I

MSENNKNSLE SSLRQLKCHF TWNLMEGENS LDDFEDKVFY RTEFQNREFK ATMCNLLAYL 60
KHLKGQNEAA LECLRKAEEL IQQEHADQAE IRSLVTWGNY AWVYYHMGRL SDVQIYVDKV 120
KHVCEKFSSP YRIESPELDC EEGWTRLKCG GNQNERAKVC FEKALEKKPK NPEFTSGLAI 180
ASYRLDNWPP SQNAIDPLRQ AIRLNPDNQY LKVLLALKLH KMREEGEEEG EGEKLVEEAL 240
EKAPGVTDVL RSAAKFYRRK DEPDKAIELL KKALEYIPNN AYLHCQIGCC YRAKVFQVMN 300
LRENGMYGKR KLLELIGHAV AHLKKADEAN DNLFRVCSIL ASLHALADQY EDAEYYFQKE 360
FSKELTPVAK QLLHLRYGNF QLYQMKCEDK AIHHFIEGVK INQKSREKEK MKDKLQKIAK 420
MRLSKNGADS EALHVLAFLQ ELNEKMQQAD EDSERGLESG SLIPSASSWN GE



Seq ID NO: 202 DNA sequence 
Nucleic Acid Accession #: NM_003090 

Coding sequence: 57-824 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GAATTCCGCG GGAGGCCACG GGCTTTCCAC AGCGCGGGGG AACGGGAGGC TGCAGGATGG 60
TCAAGCTGAC GGCGGAGCTG ATCGAGCAGG CGGCGCAGTA CACCAACGCG GTGCGCGACC 120
GGGAGCTGGA CCTCCGGGGG TATAAAATTC CCGTCATTGA AAATCTAGGT GCTACGTTAG 180
ACCAGTTTGA TGCTATTGAT TTTTCTGACA ATGAGATCAG GAAACTGGAT GGTTTTCCTT 240
TGTTGAGAAG ACTGAAAACA TTGTTAGTGA ACAACAACAG AATATGCCGT ATAGGTGAGG 300
GACTTGATCA GGCTCTGCCC TGTCTGACAG AACTCATTCT CACCAATAAT AGTCTCGTGG 360
AACTGGGTGA TCTGGACCCT CTGGCATCTC TCAAATCGCT GACTTACCTA AGTATCCTAA 420
GAAATCCGGT AACCAATAAG AAGCATTACA GATTGTATGT GATTTATAAA GTTCCGCAAG 480
TCAGAGTACT GGATTTCCAG AAAGTGAAAC TAAAAGAGCG TCAGGAAGCA GAGAAAATGT 540
TCAAGGGCAA ACGGGGTGCA CAGCTTGCAA AGGATATTGC CAGGAGAAGC AAAACTTTTA 600
ATCCAGGTGC TGGTTTGCCA ACTGACAAAA AGAGAGGTGG GCCATCTCCA GGGGATGTAG 660
AAGCAATCAA GAATGCCATA GCAAATGCTT CAACTCTGGC TGAAGTGGAG AGGCTGAAGG 720
GGTTGCTGCA GTCTGGTCAG ATCCCTGGCA GAGAACGCAG ATCAGGGCCC ACTGATGATG 780
GTGAAGAAGA GATGGAAGAA GACACAGTCA CAAACGGGTC CTGAGCAGTG AGGCAGATGT 840
ATAATAATAG GCCCTCTTGG AACAAGTCTT GCTTTTCGAA CATGGTATAA TAGCCTTGTT 900
TGTGTTAGCA AAGTGGAATC TATCAGCATT GTTGAAATGC TTAAGACTGC TGCTGATAAT 960
TTTGTAATAT AAGTrTTGAA ATCTAAATGT CAATTTTCTA CAAATTATAA AAATAAACTC 1020
CACTCTCTAT GCTAAAAAAA AAAAAAAGGA ATTC



Seq ID NO: 203 Protein sequence:
Protein Accession #: NP_003081.1

1 11 21 31 41 51 

| | | | | | 

MVKLTAELIE QAAQYTNAVR DRELDLRGYK IPVIENLGAT LDQFDAIDFS DNEIRKLDGF 60 
PLLRRLKTLL VNNNRICRIG EGLDQALPCL TELILTNNSL VELGDLDPLA SLKSLTYLSI 120 
LRNPVTNKKH YRLYVIYKVP QVRVLDFQKV KLKERQEAEK MFKGKRGAQL AKDIARRSKT 180 
FNPGAGLPTD KKRGGPSPGD VEAIKNAIAN ASTLAEVERL KGLLQSGQIP GREKRSGPTD 240 
DGEEEMEEDT VTNGS 



Seq ID NO: 204 DNA sequence 

Nucleic Acid Accession #: NM_017643.1 

Coding sequence: 169-1401 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

AATAGCAATA GCTTTATAGC AGCTCCGGTT ACCTGTTTTA AACATGGAAG GAGAGTCGCT 60
CCCAGATAGC CCTCACGAGT GGCCCTGGAG CAGGGAGTGG TGGAGCAGAT CTTCCTTGTT 120
TGGGAGGAGC CTGAGGTGGA CCTCGCGTCC TGAGTCTGGA AGGCACCTAT GGGGACCTGC 180
TGGGGTGATA TCTCAGAAAA TGTGAGAGTA GAAGTTCCCA ATACAGACTG CAGCCTACCT 240
ACCAAAGTCT TCTGGATTGC TGGAATTGTA AAATTAGCAG GTTACAATGC CCTTTTAAGA 300
TATGAAGGAT TTGAAAATGA CTCTGGTCTG GACTTCTGGT GCAATATATG TGGTTCTGAT 360
ATCCATCCAG TTGGTTGGTG TGCAGCCAGC GGAAAACCTC TTGTTCCTCC TAGAACTATT 420
CAGCATAAAT ATACAAACTG GAAAGCTTTT CTAGTGAAAC GACTTACTGG TGCCAAAACA 480
CTGCCTCCTG ATTTCTCCCA AAAGGTTTCA GAGAGTATGC AGTATCCTTT CAAACCTTGC 540
ATGAGAGTAG AAGTGGTTGA CAAGAGGCAT TTGTGTCGAA CACGAGTAGC AGTGGTGGAA 600
AGTGTAATTG GAGGAAGATT AAGACTAGTG TATGAAGAAA GCGAAGATAG AACAGATGAC 660
TTCTGGTGCC ATATGCACAG CCCATTAATA CATCATATTC GTTGGTCTCG AAGCATAGGT 720
CATCGATTCA AAAGATCTGA TATTACAAAG AAACAGGATG GACATTTTGA TACACCACCA 780
CATTTATTTG CTAAGGTAAA AGAAGTAGAC CAGAGTGGGG AATGGTTCAA GGAAGGAATG 840
AAATTGGAAG CTATAGACCC ATTAAATCTT TCTACAATAT GTGTCGCAAC CATTAGAAAG 900
GTGCTAGCTG ACGGATTCCT GATGATTGGG ATCGATGGCT CAGAAGCAGC AGACGGATCT 960
GACTGGTTCT GTTACCATGC AACCTCTCCT TCTATTTTCC CTGTCGGTTT CTGTGAAATT 1020
AACATGATTG AACTTACTCC ACCCAGAGGT TACACAAAAC TTCCTTTTAA ATGGTTTGAC 1080
TACCTCAGGG AAACTGGCTC CATTGCAGCA CCAGTAAAAC TATTTAATAA GGATGTTCCA 1140
AATCACGGAT TTCGTGTAGG AATGAAATTA GAAGCAGTAG ATCTCATGGA GCCACGTTTA 1200
ATATGTGTAG CCACAGTAAC TCGAATTATT CATCGTCTCT TGAGGATACA TTTTGATGGA 1260
TGGGAAGAAG AGTATGATCA GTGGGTAGAC TGTGAGTCAC CTGACCTCTA TCCTGTAGGG 1320
TGGTGTCAGT TAACTGGATA TCAACTACAG CCTCCAGCAT CACAGTGTAA GTTGGTATAC 1380
AGAAAAGGTG TCCTTTTGTA AAAATCAGCA ATTCTCCAGA GGACTATCTC ACATAAGTCA 1440
TCTTATGAGC TCACAGGACA AGAATATACC TATGTCTGAT TGGTTGCCAG GTAAGACATT 1500
AAGACTCAAC AACAATATCA CAGAATCAGA CCATGTGTCC CATGGCAATG TGAATCCAAT 1560
AGTCAATTAC ATAATGACTA TAGAAACACA ACAGTCACCA AATTAAACTA GACTTACTAT 1620
TTTAGTGAGT TAAAAATTAC ATACTAAAAG TTTATTGGTA GGTAATAAAT GCTTTTGAGT 1680
AAATAGTGGA AAATGTCTCA TGTTGAGGCT ATGGTTTTGT AGGAACAAGT ACCCTTATTT 1740
TCAGAGCATC ATGTACTTAA GTATAATGGT CTTGGTAAAG ATAGTTCATA TAAGTTGTAT 1800
CTAGACAACT GTATCGTCTA AATTGTAAAC AATTATCTAG TACCAATTTT CCCTTTTTAT 1860
ttitcagcat caagagaaaa ccaatcagct TCATCAAAAC AGAAGAAAAA GGCTAAGTCC 1920
cagcaataca aaggacataa gaaaagtggg TCACCACGTG GTGTTCACAT ACATTTTCTA 1980
ATTGTTAACT AATTGGAGTC ACAGTATTCT TGGACAGAAA ATGATATATC TTGTGAGAAC 2040
TGATGATTGT GCATTATGTA TTATGCTTAA AGGTGCAGTA TGCCATAAAA GGCAAACCCT 2100
TGCAATAATG AGAAACACTG ATATTTTACT AACAGGAGAA ATGATTACCA CAGTATTTAA 2160
AGTATACGTG GTAAAGAATA GAGTCTGTGA ATGATTCTTG AAATAATATG TAAAACCTAC 2220
TGAAAGTTAA TCCTTTTTAA AAACTTTATT TAAAAAGAAA AATTAGCAGC CAGGTGCAGT 2280
GGCTCACGCC TGTAATCCCA GCACTTTAGG AGGCCGAGGC TGGCAGATCA CAAGGTCAGG 2340
AGATCGAGAC CATCCTGGCT AACACGGTGA AACCCTGTCT CCACCAAAAA TACAAAAAAT 2400
CTGCCGGGCG TGGTGGCACA CGCCTGAAGT CCCAGCTACT CAGGAGGCTG AGGCAAGAGA 2460
ATCACTTGAA CCCAGGAGGC AGAGGTTGCA GTGGGCCAAG ATCACGCCAC TACATTCCAG 2520
CTGGGCAACA CAGCAAGACT CTGTCTCAAA AAAAAAAAAA AAAA



Seq ID NO: 205 Protein sequence: 
Protein Accession #: NP_060113.1 

1 11 21 31 41 51 

I I I I I I 

MGTCWGDISE NVRVEVPNTD CSLPTKVFWI AGIVKLAGYN ALLRYEGFEN DSGLDFWCNI 60 
CGSDIHPVGW CAASGKPLVP PRTIQHKYTN WKAFLVKRLT GAKTLPPDFS QKVSESMQYP 120 
FKPCMRVEVV DKRHLCRTRV AVVESVIGGR LRLVYEESED RTDDFWCHMH SPLIHHIGWS 180 
RSIGHRFKRS DITKKQDGHF DTPPHLFAKV KEVDQSGEWF KEGMKLEAID PLNLSTICVA 240 
TIRKVLADGF LMIGIDGSEA ADGSDWFCYH ATSPSIFPVG FCEINMIELT PPRGYTKLPF 300 
KWFDYLRETG SIAAPVKLFN KDVPNHGFRV GMKLEAVDLM EPRLICVATV TRIIHRLLRI 360 
HFDGWEEEYD QWVDCESPDL YPVGWCQLTG YQLQPPASQC KLVYRKGVLL 



Seq ID NO: 206 DNA sequence 
Nucleic Acid Accession #: NM_012334 

Coding sequence: 223-6399 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I 1 I 

GAGACAAAGG CTGCCGTCGG GACGGGCGAG TTAGGGACTT GGGTTTGGGC GAACAAAAGG 60 

TGAGAAGGAC AAGAAGGGAC CGGGCGATGG CAGCAGGGGA GCCCCGCGGG CGCGCGTCCT 120 

CGGGAGTGGC GCCGTGACAC GCATGGTTTC CCCGGACCCG CGGCGGCGCT GACTTCCGCG 180 

AGTCGGAGOG GCACTCGGCG AGTCCGGGAC TGOGCTGGAA CAATGGATAA CTTCTTCACC 240 

GAGGGAACAC GGGTCTGGCT GAGAGAAAAT GGCCAGCATT TTCCAAGTAC TGTAAATTCC 300 

TGTGCAGAAG GCATOGTCGT CTTCCGGACA GACTATGGTC AGGTATTCAC TTACAAGCAG 360 

AGCACAATTA CCCACCAGAA GGTGACTGCT ATGCACCCCA OGAACGAGGA GGGCGTGGAT 420 

GACATGGCGT CCTTGACAGA GCTCCATGGC GGCTCCATCA TGTATAACTT ATTCCAGCGG 480 

TATAAGAGAA ATCAAATATA TACCTACATC GGCTCCATCC TGGCCTCCGT GAACCCCTAC 540 

CAGCCCATOG CCGGGCTGTA CGAGCCTGCC ACCATGGAGC AGTACAGCCG GOGCCACCTG 600 

GGCGAGCTGC CCCCGCACAT CTTCGCCATC GCCAACGAGT GCTACCGCTG CCTGTGGAAG 660 

CGCTACGACA ACCAGTGCAT CCTCATCAGT GGTGAAAGTG GGGCAGGTAA AACCGAAAGC 720 

ACTAAATTGA TCCTCAAGTT TCTGTCAGTC ATCAGTCAAC AGTCTTTGGA ATTGTCCTTA 780 

AAGGAGAAGA CATCCTGTGT TGAACGAGCT ATTCTTGAAA GCAGCCCCAT CATGGAAGCT 840 

TTCGGCAATG CGAAGACCGT GTACAACAAC AACTCTAGTC GCTTTGGGAA GTTTGTTCAG 900

CTGAACATCT GTCAGAAAGG AAATATTCAG GGCGGGAGAA TTGTAGATTA TTTATTAGAA 960

AAAAACCGAG TAGTAAGGCA AAATCCCGGG GAAAGGAATT ATCACATATT TTATGCACTG 1020

CTGGCAGGGC TGGAACATGA AGAAAGAGAA GAATTTTATT TATCTACGCC AGAAAACTAC 1080

CACTACTTGA ATCAGTCTGG ATGTGTAGAA GACAAGACAA TCAGTGACCA GGAATCCTTT 1140

AGGGAAGTTA TTACGGCAAT GGACGTGATG CAGTTCAGCA AGGAGGAAGT TCGGGAAGTG 1200

TCGAGGCTGC TTGCTGGTAT ACTGCATCTT GGGAACATAG AATTTATCAC TGCTGGTGGG 1260

GCACAGGTTT CCTTCAAAAC AGCTTTGGGC AGATCTGCGG AGTTACTTGG GCTGGACCCA 1320

ACACAGCTCA CAGATGCTTT GACCCAGAGA TCAATGTTCC TCAGGGGAGA AGAGATCCTC 1380

ACGCCTCTCA ATGTTCAACA GGCAGTAGAC AGCAGGGACT CCCTGGCCAT GGCTCTGTAT 1440

GCGTGCTGCT TTGAGTGGGT AATCAAGAAG ATCAACAGCA GGATCAAAGG CAATGAGGAC 1500

TTCAAGTCTA TTGGCATCCT CGACATCTTT GGATTTGAAA ACTTTGAGGT TAATCACTTT 1560

GAACAGTTCA ATATAAACTA TGCAAACGAG AAACTTCAGG AGTACTTCAA CAAGCATATT 1620

TTTTCTTTAG AACAACTAGA ATATAGCCGG GAAGGATTAG TGTGGGAAGA TATTGACTGG 1680

ATAGACAATG GAGAATGCCT GGACTTGATT GAGAAGAAAC TTGGCCTCCT AGCCCTTATC 1740

AATGAAGAAA GCCATTTTCC TCAAGCCACA GACAGCACCT TATTGGAGAA GCTACACAGT 1800

CAGCATGCGA ATAACCACTT TTATGTGAAG CCCAGAGTTG CAGTTAACAA TTTTGGAGTG 1860

AAGCACTATG CTGGAGAGGT GCAATATGAT GTCCGAGGTA TCTTGGAGAA GAACAGAGAT 1920

ACATTTCGAG ATGACCTTCT CAATTTGCTA AGAGAAAGCC GATTTGACTT TATCTACGAT 1980

CTTTTTGAAC ATGTTTCAAG CCGCAACAAC CAGGATACCT TGAAATGTGG AAGCAAACAT 2040

CGGCGGCCTA CAGTCAGCTC ACAGTTCAAG GACTCACTGC ATTCCTTAAT GGCAACGCTA 2100

AGCTCCTCTA ATCCTTTCTT TGTTCGCTGT ATCAAGCCAA ACATGCAGAA GATGCCAGAC 2160

CAGTTTGACC AGGCGGTTGT GCTGAACCAG CTGCGGTACT CAGGGATGCT GGAGACTGTG 2220

AGAATCCGCA AAGCTGGGTA TGCGGTCCGA AGACCCTTTC AGGACTTTTA CAAAAGGTAT 2280

AAAGTGCTGA TGAGGAATCT GGCTCTGCCT GAGGACGTCC GAGGGAAGTG CACGAGCCTG 2340

CTGCAGCTCT ATGATGCCTC CAACAGCGAG TGGCAGCTGG GGAAGACCAA GGTCTTTCTT 2400

CGAGAATCCT TGGAACAGAA ACTGGAGAAG CGGAGGGAAG AGGAAGTGAG CCACGCGGCC 2460

ATGGTGATTC GGGCCCATGT CTTGGGCTTC TTAGCACGAA AACAATACAG AAAGGTCCTT 2520

TATTGTGTGG TGATAATACA GAAGAATTAC AGAGCATTCC TTCTGAGGAG GAGATTTTTG 2580

CACCTGAAAA AGGCAGCCAT AGTTTTCCAG AAGCAACTCA GAGGTCAGAT TGCTCGGAGA 2640

GTTTACAGAC AATTGCTGGC AGAGAAAAGG GAGCAAGAAG AAAAGAAGAA ACAGGAAGAG 2700

GAAGAAAAGA AGAAACGGGA GGAAGAAGAA AGAGAAAGAG AGAGAGAGCG AAGAGAAGCC 2760

GAGCTCCGCG CCCAGCAGGA AGAAGAAACG AGGAAGCAGC AAGAACTCGA AGCCTTGCAG 2820

AAGAGCCAGA AGGAAGCTGA ACTGACCCGT GAACTGGAGA AACAGAAGGA AAATAAGCAG 2880

GTGGAAGAGA TCCTCCGTCT GGAGAAAGAA ATCGAGGACC TGCAGCGCAT GAAGGAGCAG 2940

CAGGAGCTGT CGCTGACCGA GGCTTCCCTG CAGAAGCTGC AGGAGCGGCG GGACCAGGAG 3000

CTCCGCAGGC TGGAGGAGGA AGCGTGCAGG GCGGCCCAGG AGTTCCTCGA GTCCCTCAAT 3060

TTCGACGAGA TCGACGAGTG TGTCCGGAAT ATCGAGCGGT CCCTGTCGGT GGGAAGCGAA 3120

TTTTCCAGCG AGCTGGCTGA GAGCGCATGC GAGGAGAAGC CCAACTTCAA CTTCAGCCAG 3180

CCCTACCCAG AGGAGGAGGT CGATGAGGGC TTCGAAGCCG ACGACGACGC CTTCAAGGAC 3240

TCCCCCAACC CCAGCGAGCA CGGCCACTCA GACCAGCGAA CAAGTGGCAT CCGGACCAGC 3300

GATGACTCTT CAGAGGAGGA CCCATACATG AACGACACGG TGGTGCCCAC CAGCCCCAGT 3360

GCGGACAGCA CGGTGCTGCT CGCCCCATCA GTGCAGGACT CCGGGAGCCT ACACAACTCC 3420

TCCAGCGGCG AGTCCACCTA CTGCATGCCC CAGAACGCTG GGGACTTGCC CTCCCCAGAC 3480

GGCGACTACG ACTACGACCA GGATGACTAT GAGGACGGTG CCATCACTTC CGGCAGCAGC 3540

GTGACCTTCT CCAACTCCTA CGGCAGCCAG TGGTCCCCCG ACTACCGCTG CTCTGTGGGG 3600

ACCTACAACA GCTCGGGTGC CTACCGGTTC AGCTCTGAGG GGGCGCAGTC CTCGTTTGAA 3660

GATAGTGAAG AGGACTTTGA TTCCAGGTTT GATACAGATG ATGAGCTTTC ATACCGGCGT 3720

GACTCTGTGT ACAGCTGTGT CACTCTGCCG TATTTCCACA GCTTTCTGTA CATGAAAGGT 3780

GGCCTGATGA ACTCTTGGAA ACGCCGCTGG TGCGTCCTCA AGGATGAAAC CTTCTTGTGG 3840

TTCCGCTCCA AGCAGGAGGC CCTCAAGCAA GGCTGGCTCC ACAAAAAAGG GGGGGGCTCC 3900

TCCACGCTGT CCAGGAGAAA TTGGAAGAAG CGCTGGTTTG TCCTCCGCCA GTCCAAGCTG 3960

ATGTACTTTG AAAACGACAG CGAGGAGAAG CTCAAGGGCA CCGTAGAAGT GCGAACGGCA 4020

AAAGAGATCA TAGATAACAC CACCAAGGAG AATGGGATCG ACATCATTAT GGCCGATAGG 4080

ACTTTCCACC TGATTGCAGA GTCCCCAGAA GATGCCAGCC AGTGGTTCAG CGTGCTGAGT 4140

CAGGTCCACG CGTCCACGGA CCAGGAGATC CAGGAGATGC ATGATGAGCA GGCAAACCCA 4200

CAGAATGCTG TGGGCACCTT GGATGTGGGG CTGATTGATT CTGTGTGTGC CTCTGACAGC 4260

CCTGATAGAC CCAACTCGTT TGTGATCATC ACGGCCAACC GGGTGCTGCA CTGCAACGCC 4320

GACACGCCGG AGGAGATGCA CCACTGGATA ACCCTGCTGC AGAGGTCCAA AGGGGACACC 4380

AGAGTGGAGG GCCAGGAATT CATCGTGAGA GGATGGTTGC ACAAAGAGGT GAAGAACAGT 4440

CCGAAGATGT CTTCACTGAA ACTGAAGAAA CGGTGGTTTG TACTCACCCA CAATTCCCTG 4500

GATTACTACA AGAGTTCAGA GAAGAACGCG CTCAAACTGG GGACCCTGGT CCTCAACAGC 4560

CTCTGCTCTG TCGTCCCCCC AGATGAGAAG ATATTCAAAG AGACAGGCTA CTGGAACGTC 4620

ACCGTGTACG GGCGCAAGCA CTGTTACCGG CTCTACACCA AGCTGCTCAA CGAGGCCACC 4680

CGGTGGTCCA GTGCCATTCA AAACGTGACT GACACCAAGG CCCCGATCGA CACCCCCACC 4740

CAGCAGCTGA TTCAAGATAT CAAGGAGAAC TGCCTGAACT CGGATGTGGT GGAACAGATT 4800

TACAAGCGGA ACCCGATCCT TCGATACACC CATCACCCCT TGCACTCCCC GCTCCTGCCC 4860

CTTCCGTATG GGGACATAAA TCTCAACTTG CTCAAAGACA AAGGCTATAC CACCCTTCAG 4920

GATGAGGCCA TCAAGATATT CAATTCCCTG CAGCAACTGG AGTCCATGTC TGACCCAATT 4980

CCAATAATCC AGGGCATCCT ACAGACAGGG CATGACCTGC GACCTCTGCG GGACGAGCTG 5040

TACTGCCAGC TTATCAAACA GACCAACAAA GTGCCCCACC CCGGCAGTGT GGGCAACCTG 5100

TACAGCTGGC AGATCCTGAC ATGCCTGAGC TGCACCTTCC TGCCGAGTCG AGGGATTCTC 5160

AAGTATCTCA AGTTCCATCT GAAAAGGATA CGGGAACAGT TTCCAGGAAC CGAGATGGAA 5220

AAATACGCTC TCTTCACTTA CGAATCTCTT AAGAAAACCA AATGCCGAGA GTTTGTGCCT 5280

TCCCGAGATG AAATAGAAGC TCTGATCCAC AGGCAGGAAA TGACATCCAC GGTCTATTGC 5340

CATGGCGGCG GCTCCTGCAA GATCACCATC AACTCCCACA CCACTGCTGG GGAGGTGGTG 5400

GAGAAGCTGA TCCGAGGCCT GGCCATGGAG GACAGCAGGA ACATGTTTGC TTTGTTTGAA 5460

TACAACGGCC ACGTCGACAA AGCCATTGAA AGTCGAACCG TCGTAGCTGA TGTCTTAGCC 5520

AAGTTTGAAA AGCTGGCTGC CACATCCGAG GTTGGGGACC TGCCATGGAA ATTCTACTTC 5580

AAACTTTACT GCTTCCTGGA CACAGACAAC GTGCCAAAAG ACAGTGTGGA GTTTGCATTT 5640 

ATGTTTGAAC AGGCCCACGA AGCGGTTATC CATGGCCACC ATCCAGCCCC GGAAGAAAAC 5700 

CTCCAGGTTC TTGCTGCCCT GCGACTCCAG TATCTGCAGG GGGATTATAC TCTGCACGCT 5760

GCCATCCCAC CTCTCGAAGA GGTTTATTCC CTGCAGAGAC TCAAGGCCCG CATCAGCCAG 5820 

TCAACCAAAA CCTTCACCCC TTGTGAACGG CTGGAGAAGA GGCGGACGAG CTTCCTAGAG 5880 

GGGACCCTGA GGCGGAGCTT CCGGACAGGA TCCGTGGTCC GGCAGAAGGT CGAGGAGGAG 5940 

CAGATGCTGG ACATGTGGAT TAAGGAAGAA GTCTCCTCTG CTCGAGCCAG TATCATTGAC 6000 

AAGTGGAGGA AATTTCAGGG AATGAACCAG GAACAGGCCA TGGCCAAGTA CATGGCCTTG 6060

ATCAAGGAGT GGCCTGGCTA TGGCTCGACG CTGTTTGATG TGGAGTGCAA GGAAGGTGGC 6120

TTCCCTCAGG AACTCTGGTT GGGTGTCAGC GCGGACGCCG TCTCCGTCTA CAAGCGTGGA 6180 

GAGGGAAGAC CACTGGAAGT CTTCCAGTAT GAACACATCC TCTCTTTTGG GGCACCCCTG 6240 

GCGAATACGT ATAAGATCGT GGTCGATGAG AGGGAGCTGC TCTTTGAAAC CAGTGAGGTG 6300

GTGGATGTGG CCAAGCTCAT GAAAGCCTAC ATCAGCATGA TCGTGAAGAA GCGCTACAGC 6360

ACGACACGCT CCGCCAGCAG CCAGGGCAGC TCCAGGTGAA GGCGGGACAG AGCCCACCTG 6420 

TCTTTGCTAC CTGAACGCAC CACCCTCTGG CCTAGGCTGG CTCCAGTGTG CCATGCCCAG 6480 

CCAAAACAAA CACAGAGCTG CCCAGGCTTT CTGGAAGCTT CTGGTCTGAG GGAGGTGTCT 6540 

COGAGGATCC TTTTGCCTGC CGCCTTCATT GATCCTGTAT TAAGCTGTCA ACTTTAACAG 6600 

TCTGCACAGT TTCCAAAGCT TTACTACTCT TAGAGGACAC ATGCCTTAAA AAAGGAGGGG 6660

AGGAACCACG CTGCCACCAA AGCAGCCGGA AGTGCCTTAA CTTGTGGAAC CAACACTAAT 6720 

CGACCGTAAC TGTGCTACTG AAGGGAACTG CCTTTCCCCC TTCTGGGGGA GACTTAACAG 6780 

AGCGTGGAAG GGGGGCATTC TCTGTCAATG ATGCACTAAC CTCCCAACCT GATTTCCCCG 6840

AATCTGAGGG AAGGTGAGGG AGTGGGAAGG GGGATGGAGA GCTCGAGGGG ACAGTGTGTT 6900 

TGAGCTGGAG TGCTGCGGGC AGCCTTTCTC ATGGAATGAC ATGAATCAAC TTTTTTCTTT 6960

GTTTCATCTT TTAAGTGTAC GTGCTTGCCT GTTCGTGCAT GTGTTCATAA ACTCAACACT 7020 

TTAATCATGG TTTCATGAGC ATTAAAAAGC AAAGGGAAAA AGGATGTGTA ATGGTGTACA 7080 

CAGTCTGTAT ATTTTAATAA TGCAGAGCTA TAGTCTCAAT TGTTACTTTA TAAGGTGGTT 7140 

TTATTAACAA ACCCAAATCC TGGATTTTCC TGTCTTTGCT GTATTTTGAA AAACAOGTGT 7200 

TGACTCCATT GTTTTACATG TAGCAAAGTC TGCCATCTGT GTCTGCTGTA TTATAAACAG 7260

ATAAGCAGCC TACAAGATAA CTGTATTTAT AAACCACTCT TCAACAGCTG GCTCCAGTGC 7320 

TGGTTTTAGA ACAAGAATGA AGTCATTTTG GAGTCTTTCA TGTCTAAAAG ATTTAAGTTA 7380 

AAAACAAAGT GTTACTTGGA AGGTTAGCTT CTATCATTCT GGATAGATTA CAGATATAAT 7440 

AACCATGTTG ACTATGGGGG AGAGACGCTG CATTCCAGAA ACGTCTTAAC ACTTGAGTGA 7500 

ATCTTCAAAG GACCCTGACA TTAAATGCTG AGGCTTTAAT ACACACATAT TTTATCCCAA 7560

GTTTATAATG GTGGTCTGAA CAAGGCACCT GTAAATAAAT CAGCATTTAT GACCAGAAGA 7620 

AAAATAATCT GGTCTTGGAC TTTTTATTTT TATATGGAAA AGTTTTAAGG ACTTGGGCCA 7680 

ACTAAGTCTA CCCACACGAA AAAAGAAATT TGCCTTGTCC CTTTGTGTAC AACCATGCAA 7740 
AACTGTTTGT TGGCTCACAG AAGTTCTGAC AATAAAAGAT ACTAGCT 






Seq ID NO: 207 Protein sequence:
Protein Accession #: NP_036466 

1 11 21 31 41 51 



MDNFFTEGTR VWLRENGQHF PSTVNSCAEG IVVFRTDYGQ VFTYKQSTIT HQKVTAMHPT 60

NEEGVDDMAS LTSLHGGSIM YNLFQRYKRN QIYTYIGSIL ASVNPYQPIA GLYEPATMEQ 120 

YSRRHLGELP PHIPA1ANEC YRCIiWKRYDN QCILISGESG AGKTESTKLI LKFLSVISQQ 180 

5LELSLKEKT SCVERAILES SPIMEAFGNA KTVYNNNSSR FGKFVQLNIC QKGNIQGGRX 240 

VDYLLEKNRV VRQNPGERNY HIFYALLAGL EHEEREEFYL STPENYHYLN QSGCVEDKTI 300

SDQESFREVI TAMDVMQFSK EEVREVSRLL AGILHLGHIB PITAGGAQVS FKTALGRSAE 360 

LLGLDPTQLiT DALTQRSMFL RGEEILTPLN VQQAVDSRDS IAMALYACCF EWVIKKINSR 420 

IKGNEDFKSI GILDIFGFEN FEVNHFEQFN INYANEKLQE YFNXHIFSLE QLEYSREGLV 480 

WEDIDWIDNG ECLDLIEKKL GLiLALINEES KFPQATDSTL IiEKLHSQHAN NHFYVKPRVA 540 

VNNFGVKHYA GEVQYDVRGI LEKNRDTFRD DLLNLLRESR FDFIYDLFEH VSSRNNQDTL 600

KCGSKHRRPT VSSQFKDSLH SUWTLSSSN PFFVRCIKPN MQKMPDQFDQ AWLNQLRYS 660 

GMLETVRIRK AGYAVRRPFQ DFYKRYKVLM RNLALPEDVR GKCTSLLQLY DASNSEWQLG 720 

KTKVFLRESL EQKLBKRREE EVSHAAMVIR AHVLGFIARK QYRKVLYCW IIQKNYRAFL 780 

LRRRFLHLKK AAIVFQKQLR GQIARRVYRQ LLAEKREQEE KKKQEEEEKK KREEEERERE 840 

RERREAELRA QQEEETRKQQ ELEALQKSQK EAELTRELEK QKENKQVEEI LRLEKEIEDL 900

QRMKEQQELS LTEASLQKLQ ERRDQELRRL BEEACRAAQE FTiESLNFDEI DECVRNIERS 960 

LSVGSBFSSE LARSACEEKP NFNFSQPYPE EEVDBGFEAD DDAFKDSPNP SEHGHSDQRT 1020 

SGIRTSDDSS EEDPYMNDTV VPTSPSADST VLIAPSVQDS GSLHNSSSGE STYCMPQNAG 1080 

DLPSPDGDYD YDQDDYKDGA ITSGSSVTFS NSYGSQWSPD YRCSVGTYNS SGAYRFSSEG 1140 

AQSSFEDSEE DFDSRFDTDD ELSYRRDSVY SCVTLPYFHS FLYMKGGLMN SWKRRWCVLK 1200

DETFLWFRSK QEALKQGWLH KKGGGSSTLS RRNWKKRWFV LRQSKLMYFE NDSEBKLKGT 1260 

VEVRTAKEII DNTTKENGID IIMADRTFHL IAESPEDASQ WFSVLSQVHA STDQBIQEMH 1320 

DEQANPQNAV GTliDVGLIDS VCASDSPDRP NSFVIITANR VLHCNADTPE EMHHWlTLiLQ 1380 

RSKGDTRVEG QEFTVRGWLH KBVKNSPKMS SLKLKKRWFV LTHNSLDYYK SSEKNALKLG 1440 

TLVLNSLCSV VPPDEKIFKE TGYWNVTVYG RKHCYRLYTK LLNEATRWSS AIQNVTDTKA 1500

PIDTPTQQLI QDIKENCLNS DWEQIYKRN PILRYTHHPL HSPLLPLPYG DINLNLLKDK 1560 

GYTTLQDEAI KIFNSLQQLE SKSDPIPIIQ GILQTGHDLR PLRDBLYCQL IKQTNKVPHP 1620 

GSVGNLYSWQ ILTCLSCTFL PSRGILKYLK FHLKRIREQF PGTEMEKYAL FTYESLKKTK 1680 

CREFVPSRDE IEALIHRQEM TSTVYCHGGG SCKITINSHT TAGEWEKLI RGLAMEDSRN 1740 

MFALFEYNGH VDKAIESRTV VADVLAKFEK LAATSEVGDL PWKFYFKLYC FLDTDNVPKD 1800

SVEFAFMFEQ AHEAVIHGHH PAPEENLQVL AALRLQYLQG DYTLHAAIPP LEEVYSLQRL 1860

KARISQSTKT FTPCERLEKR RTSFLEGTLR RSFRTGSVVR QKVEEEQMLD MWIKEEVSSA 1920

RASIIDKWRK FQGMNQEQAM AKYMALIKEW PGYGSTLFDV ECKEGGFPQE LWLGVSADAV 1980

SVYKRGEGRP LEVFQYEHIL SFGAPLANTY KIVVDERELL FETSEVVDVA KLMKAYISMI 2040
VKKRYSTTRS ASSQGSSR

Seq ID NO: 208 DNA sequence 

Nucleic Acid Accession #: XM_059761.1

Coding sequence: 124-525 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

CGAAGATCTA TCCAAAATCA AGAAGCCTTT GATTTAGATG TTGCTGTAAA AGAAAATAAA 60

GATGATCTCA ATCATGTGGA TTTGAATGTG TGTACAAGCT TTTCGGGCCC GGGTAGGAGT 120

GGCATGGCTC TTATGGAAGT TAACCTATTA AGTGGCTTTA TGGTGCCTTC AGAAGCAATT 180

TCTCTGAGCG AGACAGTGAA GAAAGTGGAA TATGATCATG GAAAACTCAA CCTCTATTTA 240

GATTCTGTAA ATGAAACCCA GTTTTGTGTT AATATTCCTG CTGTGAGAAA CTTTAAAGTT 300

TCAAATACCC AAGATGCTTC AGTGTCCATA GTGGATTACT ATGAGCCAAG GAGACAGGCG 360

GTGAGAAGTT ACAACTCTGA AGTGAAGCTG TCCTCCTGTG ACCTTTGCAG TGATGTCCAG 420

GGCTGCCGTC CTTGTGAGGA TGGAGCTTCA GGCTCCCATC ATCACTCTTC AGTCATTTTT 480

ATTTTCTGTT TCAAGCTTCT GTACTTTATG GAACTTTGGC TGTGATTTAT TTTTAAAGGA 540

CTCTGTGTAA CACTAACATT TCCAGTAGTC ACATGTGATT GTTTTGTTTT CGTAGAAGAA 600

TACTGCTTCT ATTTTGAAAA AAGAGTTTTT TTTCTTTCTA TGGGGTTGCA GGGATGGTGT 660

ACAACAGGTC CTAGCATGTA TAGCTGCATA GATTTCTTCA CCTGATCTTT GTGTGGAAGA 720

TCAGAATGAA TGCAGTTGTG TGTCTATATT TTCCCCTCTC AAAATCTTTT AGAATTTTTT 780
TGGAGGTGTT TGTTTTCTCC AGAATAAAGG TATTACTTTA G
780 



Seq ID NO: 209 Protein sequence:
Protein Accession #: XP_059761.1

1 11 21 31 41 51

I I I I I I

MALMEVNLLS GFMVPSEAIS LSETVKKVEY DHGKLNLYLD SVNETQFCVN IPAVRNFKVS 60
NTQDASVSIV DYYEPRRQAV RSYNSEVKLS SCDLCSDVQG CRPCEDGASG SHHHSSVIFI 120
FCFKLLYFME LWL



Seq ID NO: 210 DNA sequence 
Nucleic Acid Accession #: NM_015472 

Coding sequence: 258-1460 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

GACACACTCC TCTACAACAC CAGAGACTCC CAAACACAAG GCCTTATATT GACTCATTTC 60 

AGCTCACATC CTGGCGACTC TCAAGAGAGA AACCTCAGAG TGACTAAAAT CTCCATAATG 120 

AGAAGACATG TACATTCAGT ATCTATTTTG GCATTTTCCC CAATACATCT CTGCTCATCT 180 

GACTCTTATC TTGGCATCTG CTTCCTGGTG GATCTGAACT GACCCATAAG CCACGCTTAC 240 

TGGTGATTTT CCAGAAGATG AATCCGGCCT CGGCGCCCCC TCCGCTCCCG CCGCCTGGGC 300 

AGCAAGTGAT CCACGTCACG CAGGACCTAG ACACAGACCT CGAAGCCCTC TTCAACTCTG 360 

TCATGAATCC GAAGCCTAGC TCGTGGCGGA AGAAGATCCT GCCGGAGTCT TTCTTTAAGG 420 

AGCCTGATTC GGGCTCGCAC TCGCGCCAGT CCAGCACCGA CTCGTCGGGC GGCCACCCGG 480 

GGCCTCGACT GGCTGGGGGT GCCCAGCATG TCCGCTCGCA CTCGTCGCCC GCGTCCCTGC 540 

AGCTGGGCAC CGGCGCGGGT GCTGCGGGTA GCCCCGCGCA GCAGCACGCG CACCTCCGCC 600 

AGCAGTCCTA CGACGTGACC GACGAGCTGC CACTGCCCCC GGGCTGGGAG ATGACCTTCA 660 

CGGCCACTGG CCAGAGGTAC TTCCTCAATC ACATAGAAAA AATCACCACA TGGCAAGACC 720 

CTAGGAAGGC GATGAATCAG CCTCTGAATC ATATGAACCT CCACCCTGCC GTCAGTTCCA 780 

CACCAGTGCC TCAGAGGTCC ATGGCAGTAT CCCAGCCAAA TCTCGTGATG AATCACCAAC 840 

ACCAGCAGCA GATGGCCCCC AGTACCCTGA GCCAGCAGAA CCACCCCACT CAGAACCCAC 900 

CCGCAGGGCT CATGAGTATG CCCAATGCGC TGACCACTCA GCAGCAGCAG CAGCAGAAAC 960 

TGCGGCTTCA GAGAATCCAG ATGGAGAGAG AAAGGATTCG AATGCGCCAA GAGGAGCTCA 1020 

TGAGGCAGGA AGCTGCCCTC TGTCGACAGC TCCCCATGGA AGCTGAGACT CTTGCCCCAG 1080 

TTCAGGCTGC TGTCAACCCA CCCACGATGA CCCCAGACAT GAGATCCATC ACTAATAATA 1140 

GCTCAGATCC TTTCCTCAAT GGAGGGCCAT ATCATTCGAG GGAGCAGAGC ACTGACAGTG 1200 

GCCTGGGGTT AGGGTGCTAC AGTGTCCCCA CAACTCCGGA GGACTTCCTC AGCAATGTGG 1260

ATGAGATGGA TACAGGAGAA AACGCAGGAC AAACACCCAT GAACATCAAT CCCCAACAGA 1320 

CCCGTTTCCC TGATTTCCTT GACTGTCTTC CAGGAACAAA CGTTGACTTA GGAACTTTGG 1380 

AATCTGAAGA CCTGATCCCC CTCTTCAATG ATGTAGAGTC TGCTCTGAAC AAAAGTGAGC 1440 

CCTTTCTAAC CTGGCTGTAA TCACTACCAT TGTAACTTGG ATGTAGCCAT GACCTTACAT 1500 

TTCCTGGGCC TCTTGGAAAA AGTGATGGAG CAGAGCAAGT CTGCAGGTGC ACCACTTCCC 1560 

GCCTCCATGA CTCGTGCTCC CTCCTTTTTA TGTTGCCAGT TTAATCATTG CCTGGTTTTG 1620 
ATTGAGAGTA ACTTAAGTTA AACATAAATA AATATTCTAT TTTCATTTTC

Seq ID NO: 211 Protein sequence:

MNPASAPPPL PPPGQQVIHV TQDLDTDLEA LFNSVMNPKP SSWRKKILPE SFFKEPDSGS 60

HSRQSSTDSS GGHPGPRLAG GAQHVRSHSS PASLQLGTGA GAAGSPAQQH AHLRQQSYDV 120

TDELPLPPGW EMTFTATGQR YFLNHIEKIT TWQDPRKAMN QPLNHMNLHP AVSSTPVPQR 180

SMAVSQPNLV MNHQHQQQMA PSTLSQQNHP TQNPPAGLMS MPNALTTQQQ QQQKLRLQRI 240

QMERERIRMR QEELMRQEAA LCRQLPMEAE TLAPVQAAVN PPTMTPDMRS ITNNSSDPFL 300

NGGPYHSREQ STDSGLGLGC YSVPTTPEDF LSNVDEMDTG ENAGQTPMNI NPQQTRFPDF 360
LDCLPGTNVD LGTLESEDLI PLFNDVESAL NKSEPFLTWL

Seq ID NO: 212 DNA sequence 
Nucleic Acid Accession #: NM_018174 

Coding sequence: 176-2194 (underlined sequences correspond to start and stop codons)



CATCTCCCCC AACCTGGGGG TCGTGTTCTT CAACGCCTGC GAGGCCGC6T CGCG6CTGGC 60 

GCGCGGCGAG GATGAGGCGG AGCTGGCGCT GAGCCTCCTG GCGCAGCTGG GCATCACGCC 120 

TCTGCCACTC AGCCGCGGCC CCGTGCCAGC CAAACCCACC GTGCTCTTCG AGAAGATGGG 180 

CGTGGGCCGG CTGGACATGT ATGTGCTGCA CCCGCCCTCC GCCGGCGCCG AGCGCACGCT 240 

GGCCTCTGTG TGCGCCCTGC TGGTGTGGCA CCCCGCCGGC CCCGGCGAGA AGGTGGTGCG 300 

CGTGCTGTTC CCCGGTTGCA CCCCGCCCGC CTGCCTCCTG GACGGCCTGG TCCGCCTGCA 360 

GCACTTGAGG TTCCTGCGAG AGCCCGTGGT GACGCCCCAG GACCTGGAGG GGCCGGGGCG 420 

AGCCGAGAGC AAAGAGAGCG TGGGCTCCCG GGACAGCTCG AAGAGAGAGG GCCTCCTGGC 480 

CACCCACCCT AGACCTGGCC AGGAGCGCCC TGGGGTGGCC CGCAAGGAGC CAGCACGGGC 540 

TGAGGCCCCA CGCAAGACTG AGAAAGAAGC CAAGACCCCC CGGGAGTTGA AGAAAGACCC 600 

CAAACCGAGT GTCTCCCGGA CCCAGCCGCG GGAGGTGCGC CGGGCAGCCT CTTCTGTGCC 660 

CAACCTCAAG AAGACGAATG CCCAGGCGGC ACCCAAGCCC CGCAAAGCGC CCAGCACGTC 720 

CCACTCTGGC TTCCCGCCGG TGGCAAATGG ACCCCGCAGC CCGCCCAGCC TCCGATGTGG 780 

AGAAGCCAGC CCCCCCAGTG CAGCCTGCGG CTCTCCGGCC TCCCAGCTGG TGGCCACGCC 840 

CAGCCTGGAG CTGGGGCCGA TCCCAGCCGG GGAGGAGAAG GCACTGGAGC TGCCTTTGGC 900 

CGCCAGCTCA ATCCCAAGGC CACGCACACC CTCCCCTGAG TCCCACCGGA GCCCCGCAGA 960 

GGGCAGCGAG CGGCTGTCGC TGAGCCCACT GCGGGGCGGG GAGGCCGGGC CAGACGCCTC 1020 

ACCCACAGTG ACCACACCCA CGGTGACCAC GCCCTCACTA CCCGCAGAGG TGGGCTCCCC 1080 

GCACTCGACC GAGGTGGACG AGTCCCTGTC GGTGTCCTTT GAGCAGGTGC TGCCGCCATC 1140 

CGCCCCCACC AGTGAGGCTG GGCTGAGCCT CCCGCTGCGT GGCCCCCGGG CGCGGCGCTC 1200 

GGCTTCCCCA CACGATGTGG ACCTGTGCCT GGTGTCACCC TGTGAATTTG AGCATCGCAA 1260 

GGCGGTGCCA ATGGCACCGG CACCTGCGTC CCCCGGCAGC TCGAATGACA GCAGTGCCCG 1320 

GTCACAGGAA CGGGCAGGTG GGCTGGGGGC CGAGGAGACG CCACCCACAT CGGTCAGCGA 1380 

GTCCCTGCCC ACCCTGTCTG ACTCGGATCC CGTGCCCCTG GCCCCCGGTG CGGCAGACTC 1440 

AGACGAAGAC ACAGAGGGCT TTGGAGTCCC TCGCCACGAC CCTTTGCCTG ACCCCCTCAA 1500 

GGTCCCCCCA CCACTGCCTG ACCCATCCAG CATCTGCATG GTGGACCCCG AGATGCTGCC 1560 

CCOCAAGACA GCACGGCAAA CGGAGAACGT CAGCCGCACC CGGAAGCCCC TGGCCCGCCC 1620 

CAACTCACGC GCTGCCGCCC CCAAAGCCAC TCCAGTGGCT GCTGCCAAAA CCAAGGGGCT 1680 

TGCTGGTGGG GACCGTGCCA GCCGACCACT CAGTGCCCGG AGTGAGCCCA GTGAGAAGGG 1740 

AGGCCGGGCA CCCCTGTCCA GAAAGTCCTC AACCCCCAAG ACTGCCACTC GAGGCCCGTC 1800 

GGGGTCAGCC AGCAGCCGGC CCGGGGTGTC AGCCACCCCA CCCAAGTCCC CGGTCTACCT 1860 

GGACCTGGCC TACCTGCCCA GCGGGAGCAG CGCCCACCTG GTGGATGAGG AGTTCTTCCA 1920 

GCGCGTGCGC GCGCTCTGCT ACGTCATCAG TGGCCAGGAC CAGCGCAAGG AGGAAGGCAT 1980 

GCGGGCCGTC CTGGACGCGC TACTGGCCAG CAAGCAGCAT TGGGACCGTG ACCTGCAGGT 2040 

GACCCTGATC CCCACTTTCG ACTCGGTGGC CATGCATACG TGGTACGCAG AGACGCACGC 2100 

CCGGCACCAG GCGCTGGGCA TCACGGTGTT GGGCAGCAAC GGCATGGTGT CCATGCAGGA 2160 

TGACGCCTTC CCGGCCTGCA AGGTGGAGTT CTAGCCCCAT CGCCGACACG CCCCCCACTC 2220 
AGCCCAGCCC GCCTGTCCCT AGATTCAGCC ACATCAGAAA TAAACTGTGA CTACACTTG 

Seq ID NO: 213 Protein sequence: 
Protein Accession #: NP_060644.1 

MGVGRLDMYV LHPPSAGAER TLASVCALLV WHPAGPGEKV VRVLFPGCTP PACLLDGLVR 60

LQHLRFLREP VVTPQDLEGP GRAESKESVG SRDSSKREGL LATHPRPGQE RPGVARKEPA 120

RAEAPRKTEK EAKTPRELKK DPKPSVSRTQ PREVRRAASS VPNLKKTNAQ AAPKPRKAPS 180

TSHSGFPPVA NGPRSPPSLR CGEASPPSAA CGSPASQLVA TPSLELGPIP AGEEKALELP 240

LAASSIPRPR TPSPESHRSP AEGSERLSLS PLRGGEAGPD ASPTVTTPTV TTPSLPAEVG 300

SPHSTEVDES LSVSFEQVLP PSAPTSEAGL SLPLRGPRAR RSASPHDVDL CLVSPCEFEH 360

RKAVPMAPAP ASPGSSNDSS ARSQERAGGL GAEETPPTSV SESLPTLSDS DPVPLAPGAA 420

DSDEDTEGFG VPRHDPLPDP LKVPPPLPDP SSICMVDPEM LPPKTARQTE NVSRTRKPLA 480

RPNSRAAAPK ATPVAAAKTK GLAGGDRASR PLSARSEPSE KGGRAPLSRK SSTPKTATRG 540

PSGSASSRPG VSATPPKSPV YLDLAYLPSG SSAHLVDEEF FQRVRALCYV ISGQDQRKEE 600

GMRAVLDALL ASKQHWDRDL QVTLIPTFDS VAMHTWYAET HARHQALGIT VLGSNGMVSM 660
QDDAFPACKV EF



Seq ID NO: 214 DNA sequence

Nucleic Acid Accession #: NM_002019.1

Coding sequence: 250-4266 (underlined sequences correspond to start and stop codons)

GCGGAGACTC CTCTCGGCTC CTCCCCGGCA GCGGCGGCGG CTCGGAGCGG GCTCCGGGGC 60

TCGGGTGCAG CGGCCAGCGG GCCTGGCGGC GAGGATTACC CGGGGAAGTG GTTGTCTCCT 120

GGCTGGAGCC GCGAGACGGG CGCTCAGGGC GCGGGGCCGG CGGCGGCGAA CGAGAGGACG 180

GACTCTGGCG GCCGGGTCGT TGGCCGGGGG AGCGCGGGCA CCGGGCGAGC AGGCCGCGTC 240

GCGCTCACCA TGGTCAGCTA CTGGGACACC GGGGTCCTGC TGTGCGCGCT GCTCAGCTGT 300

CTGCTTCTCA CAGGATCTAG TTCAGGTTCA AAATTAAAAG ATCCTGAACT GAGTTTAAAA 360 

GGCACCCAGC ACATCATGCA AGCAGGCCAG ACACTGCATC TCCAATGCAG GGGGGAAGCA 420 

GCCCATAAAT GGTCTTTGCC TGAAATGGTG AGTAAGGAAA GCGAAAGGCT GAGCATAACT 480

AAATCTGCCT GTGGAAGAAA TGGCAAACAA TTCTGCAGTA CTTTAACCTT GAACACAGCT 540 

CAAGCAAACC ACACTGGCTT CTACAGCTGC AAATATCTAG CTGTACCTAC TTCAAAGAAG 600 

AAGGAAACAG AATCTGCAAT CTATATATTT ATTAGTGATA CAGGTAGACC TTTCGTAGAG 660 

ATGTACAGTG AAATCCCCGA AATTATACAC ATGACTGAAG GAAGGGAGCT CGTCATTCCC 720 

TGCCGGGTTA CGTCACCTAA CATCACTGTT ACTTTAAAAA AGTTTCCACT TGACACTTTG 780 

ATCCCTGATG GAAAACGCAT AATCTGGGAC AGTAGAAAGG GCTTCATCAT ATCAAATGCA 840 

ACGTACAAAG AAATAGGGCT TCTGACCTGT GAAGCAACAG TCAATGGGCA TTTGTATAAG 900 

ACAAACTATC TCACACATCG ACAAACCAAT ACAATCATAG ATGTCCAAAT AAGCACACCA 960 

CGCCCAGTCA AATTACTTAG AGGCCATACT CTTGTCCTCA ATTGTACTGC TACCACTCCC 1020 

TTGAACACGA GAGTTCAAAT GACCTGGAGT TACCCTGATG AAAAAAATAA GAGAGCTTCC 1080 

GTAAGGCGAC GAATTGACCA AAGCAATTCC CATGCCAACA TATTCTACAG TGTTCTTACT 1140 

ATTGACAAAA TGCAGAACAA AGACAAAGGA CTTTATACTT GTCGTGTAAG GAGTGGACCA 1200 

TCATTCAAAT CTGTTAACAC CTCAGTGCAT ATATATGATA AAGCATTCAT CACTGTGAAA 1260 

CATCGAAAAC AGCAGGTGCT TGAAACCGTA GCTGGCAAGC GGTCTTACCG GCTCTCTATG 1320 

AAAGTGAAGG CATTTCCCTC GCCGGAAGTT GTATGGTTAA AAGATGGGTT ACCTGCGACT 1380 

GAGAAATCTG CTCGCTATTT GACTCGTGGC TACTCGTTAA TTATCAAGGA CGTAACTGAA 1440 

GAGGATGCAG GGAATTATAC AATCTTGCTG AGCATAAAAC AGTCAAATGT GTTTAAAAAC 1500 

CTCACTGCCA CTCTAATTGT CAATGTGAAA CCCCAGATTT ACGAAAAGGC CGTGTCATCG 1560 

TTTCCAGACC CGGCTCTCTA CCCACTGGGC AGCAGACAAA TCCTGACTTG TACCGCATAT 1620 

GGTATCCCTC AACCTACAAT CAAGTGGTTC TGGCACCCCT GTAACCATAA TCATTCCGAA 1680 

GCAAGGTGTG ACTTTTGTTC CAATAATGAA GAGTCCTTTA TCCTGGATGC TGACAGCAAC 1740 

ATGGGAAACA GAATTGAGAG CATCACTCAG CGCATGGCAA TAATAGAAGG AAAGAATAAG 1800

ATGGCTAGCA CCTTGGTTGT GGCTGACTCT AGAATTTCTG GAATCTACAT TTGCATAGCT 1860 

TCCAATAAAG TTGGGACTGT GGGAAGAAAC ATAAGCTTTT ATATCACAGA TGTGCCAAAT 1920 

GGGTTTCATG TTAACTTGGA AAAAATGCCG ACGGAAGGAG AGGACCTGAA ACTGTCTTGC 1980 

ACAGTTAACA AGTTCTTATA CAGAGACGTT ACTTGGATTT TACTGCGGAC AGTTAATAAC 2040 

AGAACAATGC ACTACAGTAT TAGCAAGCAA AAAATGGCCA TCACTAAGGA GCACTCCATC 2100 

ACTCTTAATC TTACCATCAT GAATGTTTCC CTGCAAGATT CAGGCACCTA TGCCTGCAGA 2160 

GCCAGGAATG TATACACAGG GGAAGAAATC CTCCAGAAGA AAGAAATTAC AATCAGAGAT 2220 

CAGGAAGCAC CATACCTCCT GCGAAACCTC AGTGATCACA CAGTGGCCAT CAGCAGTTCC 2280 

ACCACTTTAG ACTGTCATGC TAATGGTGTC CCCGAGCCTC AGATCACTTG GTTTAAAAAC 2340 

AACCACAAAA TACAACAAGA GCCTGGAATT ATTTTAGGAC CAGGAAGCAG CACGCTGTTT 2400 

ATTGAAAGAG TCACAGAAGA GGATGAAGGT GTCTATCACT GCAAAGCCAC CAACCAGAAG 2460 

GGCTCTGTGG AAAGTTCAGC ATACCTCACT GTTCAAGGAA CCTCGGACAA GTCTAATCTG 2520 

GAGCTGATCA CTCTAACATG CACCTGTGTG GCTGCGACTC TCTTCTGGCT CCTATTAACC 2580 

CTCCTTATCC GAAAAATGAA AAGGTCTTCT TCTGAAATAA AGACTGACTA CCTATCAATT 2640 

ATAATGGACC CAGATGAAGT TCCTTTGGAT GAGCAGTGTG AGCGGCTCCC TTATGATGCC 2700 

AGCAAGTGGG AGTTTGCCCG GGAGAGACTT AAACTGGGCA AATCACTTGG AAGAGGGGCT 2760 

TTTGGAAAAG TGGTTCAAGC ATCAGCATTT GGCATTAAGA AATCACCTAC GTGCCGGACT 2820 

GTGGCTGTGA AAATGCTGAA AGAGGGGGCC ACGGCCAGCG AGTACAAAGC TCTGATGACT 2880 

GAGCTAAAAA TCTTGACCCA CATTGGCCAC CATCTGAACG TGGTTAACCT GCTGGGAGCC 2940 

TGCACCAAGC AAGGAGGGCC TCTGATGGTG ATTGTTGAAT ACTGCAAATA TGGAAATCTC 3000 

TCCAACTACC TCAAGAGCAA ACGTGACTTA TTTTTTCTCA ACAAGGATGC AGCACTACAC 3060 

ATGGAGCCTA AGAAAGAAAA AATGGAGCCA GGCCTGGAAC AAGGCAAGAA ACCAAGACTA 3120 

GATAGCGTCA CCAGCAGCGA AAGCTTTGCG AGCTCCGGCT TTCAGGAAGA TAAAAGTCTG 3180 

AGTGATGTTG AGGAAGAGGA GGATTCTGAC GGTTTCTACA AGGAGCCCAT CACTATGGAA 3240 

GATCTGATTT CTTACAGTTT TCAAGTGGCC AGAGGCATGG AGTTCCTGTC TTCCAGAAAG 3300 

TGCATTCATC GGGACCTGGC AGCGAGAAAC ATTCTTTTAT CTGAGAACAA CGTGGTGAAG 3360 

ATTTGTGATT TTGGCCTTGC CCGGGATATT TATAAGAACC CCGATTATGT GAGAAAAGGA 3420 

GATACTCGAC TTCCTCTGAA ATGGATGGCT CCCGAATCTA TCTTTGACAA AATCTACAGC 3480 

ACCAAGAGCG ACGTGTGGTC TTACGGAGTA TTGCTGTGGG AAATCTTCTC CTTAGGTGGG 3540 

TCTCCATACC CAGGAGTACA AATGGATGAG GACTTTTGCA GTCGCCTGAG GGAAGGCATG 3600 

AGGATGAGAG CTCCTGAGTA CTCTACTCCT GAAATCTATC AGATCATGCT GGACTGCTGG 3660 

CACAGAGACC CAAAAGAAAG GCCAAGATTT GCAGAACTTG TGGAAAAACT AGGTGATTTG 3720 

CTTCAAGCAA ATGTACAACA GGATGGTAAA GACTACATCC CAATCAATGC CATACTGACA 3780 

GGAAATAGTG GGTTTACATA CTCAACTCCT GCCTTCTCTG AGGACTTCTT CAAGGAAAGT 3840 

ATTTCAGCTC CGAAGTTTAA TTCAGGAAGC TCTGATGATG TCAGATATGT AAATGCTTTC 3900 

AAGTTCATGA GCCTGGAAAG AATCAAAACC TTTGAAGAAC TTTTACCGAA TGCCACCTCC 3960 

ATGTTTGATG ACTACCAGGG CGACAGCAGC ACTCTGTTGG CCTCTCCCAT GCTGAAGCGC 4020 

TTCACCTGGA CTGACAGCAA ACCCAAGGCC TCGCTCAAGA TTGACTTGAG AGTAACCAGT 4080 

AAAAGTAAGG AGTCGGGGCT GTCTGATGTC AGCAGGCCCA GTTTCTGCCA TTCCAGCTGT 4140 

GGGCACGTCA GCGAAGGCAA GCGCAGGTTC ACCTACGACC ACGCTGAGCT GGAAAGGAAA 4200

ATCGCGTGCT GCTCCCCGCC CCCAGACTAC AACTCGGTGG TCCTGTACTC CACCCCACCC 4260 

ATCTAGAGTT TGACACGAAG CCTTATTTCT AGAAGCACAT GTGTATTTAT ACCCCCAGGA 4320 

AACTAGCTTT TGCCAGTATT ATGCATATAT AAGTTTACAC CTTTATCTTT CCATGGGAGC 4380 

CAGCTGCTTT TTGTGATTTT TTTAATAGTG CTTTTTTTTT TTGACTAACA AGAATGTAAC 4440 

TCCAGATAGA GAAATAGTGA CAAGTGAAGA ACACTACTGC TAAATCCTCA TGTTACTCAG 4500 

TGTTAGAGAA ATCCTTCCTA AACCCAATGA CTTCCCTGCT CCAACCCCCG CCACCTCAGG 4560 

GCACGCAGGA CCAGTTTGAT TGAGGAGCTG CACTGATCAC CCAATGCATC ACGTACCCCA 4620 

CTGGGCCAGC CCTGCAGCCC AAAACCCAGG GCAACAAGCC CGTTAGCCCC AGGGGATCAC 4680

TGGCTGGCCT GAGCAACATC TCGGGAGTCC TCTAGCAGGC CTAAGACATG TGAGGAGGAA 4740

AAGGAAAAAA AGCAAAAAGC AAGGGAGAAA AGAGAAACCG GGAGAAGGCA TGAGAAAGAA 4800

TTTGAGACGC ACCATGTGGG CACGGAGGGG GACGGGGCTC AGCAATGCCA TTTCAGTGGC 4860

TTCCCAGCTC TGACCCTTCT ACATTTGAGG GCCCAGCCAG GAGCAGATGG ACAGCGATGA 4920

GGGGACATTT TCTGGATTCT GGGAGGCAAG AAAAGGACAA ATATCTTTTT TGGAACTAAA 4980

GCAAATTTTA GACCTTTACC TATGGAAGTG GTTCTATGTC CATTCTCATT CGTGGCATGT 5040

TTTGATTTGT AGCACTGAGG GTGGCACTCA ACTCTGAGCC CATACTTTTG GCTCCTCTAG 5100

TAAGATGCAC TGAAAACTTA GCCAGAGTTA GGTTGTCTCC AGGCCATGAT GGCCTTACAC 5160

TGAAAATGTC ACATTCTATT TTGGGTATTA ATATATAGTC CAGACACTTA ACTCAATTTC 5220

TTGGTATTAT TCTGTTTTGC ACAGTTAGTT GTGAAAGAAA GCTGAGAAGA ATGAAAATGC 5280

AGTCCTGAGG AGAGTTTTCT CCATATCAAA ACGAGGGCTG ATGGAGGAAA AAGGTCAATA 5340

AGGTCAAGGG AAGACCCCGT CTCTATACCA ACCAAACCAA TTCACCAACA CAGTTGGGAC 5400

CCAAAACACA GGAAGTCAGT CACGTTTCCT TTTCATTTAA TGGGGATTCC ACTATCTCAC 5460

ACTAATCTGA AAGGATGTGG AAGAGCATTA GCTGGCGCAT ATTAAGCACT TTAAGCTCCT 5520

TGAGTAAAAA GGTGGTATGT AATTTATGCA AGGTATTTCT CCAGTTGGGA CTCAGGATAT 5580

TAGTTAATGA GCCATCACTA GAAGAAAAGC CCATTTTCAA CTGCTTTGAA ACTTGCCTGG 5640

GGTCTGAGCA TGATGGGAAT AGGGAGACAG GGTAGGAAAG GGCGCCTACT CTTCAGGGTC 5700

TAAAGATCAA GTGGGCCTTG GATCGCTAAG CTGGCTCTGT TTGATGCTAT TTATGCAAGT 5760

TAGGGTCTAT GTATTTAGGA TGCGCCTACT CTTCAGGGTC TAAAGATCAA GTGGGCCTTG 5820

GATCGCTAAG CTGGCTCTGT TTGATGCTAT TTATGCAAGT TAGGGTCTAT GTATTTAGGA 5880

TGTCTGCACC TTCTGCAGCC AGTCAGAAGC TGGAGAGGCA ACAGTGGATT GCTGCTTCTT 5940

GGGGAGAAGA GTATGCTTCC TTTTATCCAT GTAATTTAAC TGTAGAACCT GAGCTCTAAG 6000

TAACCGAAGA ATGTATGCCT CTGTTCTTAT GTGCCACATC CTTGTTTAAA GGCTCTCTGT 6060

ATGAAGAGAT GGGACCGTCA TCAGCACATT CCCTAGTGAG CCTACTGGCT CCTGGCAGCG 6120

GCTTTTGTGG AAGACTCACT AGCCAGAAGA GAGGAGTGGG ACAGTCCTCT CCACCAAGAT 6180

CTAAATCCAA ACAAAAGCAG GCTAGAGCCA GAAGAGAGGA CAAATCTTTG TTGTTCCTCT 6240

TCTTTACACA TACGCAAACC ACCTGTGACA GCTGGCAATT TTATAAATCA GGTAACTGGA 6300

AGGAGGTTAA ACTCAGAAAA AAGAAGACCT CAGTCAATTC TCTACTTTTT TTTTTTTTTT 6360

TCCAAATCAG ATAATAGCCC AGCAAATAGT GATAACAAAT AAAACCTTAG CTGTTCATGT 6420

CTTGATTTCA ATAATTAATT CTTAATCATT AAGAGACCAT AATAAATACT CCTTTTCAAG 6480

AGAAAAGCAA AACCATTAGA ATTGTTACTC AGCTCCTTCA AACTCAGGTT TGTAGCATAC 6540

ATGAGTCCAT CCATCAGTCA AAGAATGGTT CCATCTGGAG TCTTAATGTA GAAAGAAAAA 6600

TGGAGACTTG TAATAATGAG CTAGTTACAA AGTGCTTGTT CATTAAAATA GCACTGAAAA 6660

TTGAAACATG AATTAACTGA TAATATTCCA ATCATTTGCC ATTTATGACA AAAATGGTTG 6720

GCACTAACAA AGAACGAGCA CTTCCTTTCA GAGTTTCTGA GATAATGTAC GTGGAACAGT 6780

CTGGGTGGAA TGGGGCTGAA ACCATGTGCA AGTCTGTGTC TTGTCAGTCC AAGAAGTGAC 6840

ACCGAGATGT TAATTTTAGG GACCCGTGCC TTGTTTCCTA GCCCACAAGA ATGCAAACAT 6900

CAAACAGATA CTCGCTAGCC TCATTTAAAT TGATTAAAGG AGGAGTGCAT CTTTGGCCGA 6960

CAGTGGTGTA ACTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TGTGGGTGTG 7020

GGTGTATGTG TGTTTTGTGC ATAACTATTT AAGGAAACTG GAATTTTAAA GTTACTTTTA 7080

TACAAACCAA GAATATATGC TACAGATATA AGACAGACAT GGTTTGGTCC TATATTTCTA 7140

GTCATGATGA ATGTATTTTG TATACCATCT TCATATAATA TACTTAAAAA TATTTCTTAA 7200

TTGGGATTTG TAATCGTACC AACTTAATTG ATAAACTTGG CAACTGCTTT TATGTTCTGT 7260

CTCCTTCCAT AAATTTTTCA AAATACTAAT TCAACAAAGA AAAAGCTCTT TTTTTTCCTA 7320

AAATAAACTC AAATTTATCC TTGTTTAGAG CAGAGAAAAA TTAAGAAAAA CTTTGAAATG 7380

GTCTCAAAAA ATTGCTAAAT ATTTTCAATG GAAAACTAAA TGTTAGTTTA GCTGATTGTA 7440

TGGGGTTTTC GAACCTTTCA CTTTTTGTTT GTTTTACCTA TTTCACAACT GTGTAAATTG 7500

CCAATAATTC CTGTCCATGA AAATGCAAAT TATCCAGTGT AGATATATTT GACCATCACC 7560

CTATGGATAT TGGCTAGTTT TGCCTTTATT AAGCAAATTC ATTTCAGCCT GAATGTCTGC 7620
CTATATATTC TCTGCTCTTT GTATTCTCCT TTGAACCCGT TAAAACATCC TGTGGCACTC

Seq ID NO: 215 Protein sequence:

1 11 21 31 41 51

I I I I I I

MVSYWDTGVL LCALLSCLLL TGSSSGSKLK DPELSLKGTQ HIMQAGQTLH LQCRGEAAHK 60

WSLPEMVSKE SERLSITKSA CGRNGKQFCS TLTLNTAQAN HTGFYSCKYL AVPTSKKKET 120

ESAIYIFISD TGRPFVEMYS EIPEIIHMTE GRELVIPCRV TSPNITVTLK KFPLDTLIPD 180

GKRIIWDSRK GFIISNATYK EIGLLTCEAT VNGHLYKTNY LTHRQTNTII DVQISTPRPV 240

KLLRGHTLVL NCTATTPLNT RVQMTWSYPD EKNKRASVRR RIDQSNSHAN IFYSVLTIDK 300

MQNKDKGLYT CRVRSGPSFK SVNTSVHIYD KAFITVKHRK QQVLETVAGK RSYRLSMKVK 360

AFPSPEVVWL KDGLPATEKS ARYLTRGYSL IIKDVTEEDA GNYTILLSIK QSNVFKNLTA 420

TLIVNVKPQI YEKAVSSFPD PALYPLGSRQ ILTCTAYGIP QPTIKWFWHP CNHNHSEARC 480

DFCSNNEESF ILDADSNMGN RIESITQRMA IIEGKNKMAS TLVVADSRIS GIYICIASNK 540

VGTVGRNISF YITDVPNGFH VNLEKMPTEG EDLKLSCTVN KFLYRDVTWI LLRTVNNRTM 600

HYSISKQKMA ITKEHSITLN LTIMNVSLQD SGTYACRARN VYTGEEILQK KEITIRDQEA 660

PYLLRNLSDH TVAISSSTTL DCHANGVPEP QITWFKNNHK IQQEPGIILG PGSSTLFIER 720

VTEEDEGVYH CKATNQKGSV ESSAYLTVQG TSDKSNLELI TLTCTCVAAT LFWLLLTLLI 780

RKMKRSSSEI KTDYLSIIMD PDEVPLDEQC ERLPYDASKW EFARERLKLG KSLGRGAFGK 840

VVQASAFGIK KSPTCRTVAV KMLKEGATAS EYKALMTELK ILTHIGHHLN VVNLLGACTK 900

QGGPLMVIVE YCKYGNLSNY LKSKRDLFFL NKDAALHMEP KKEKMEPGLE QGKKPRLDSV 960

TSSESFASSG FQEDKSLSDV EEEEDSDGFY KEPITMEDLI SYSFQVARGM EFLSSRKCIH 1020

RDLAARNILL SENNVVKICD FGLARDIYKN PDYVRKGDTR LPLKWMAPES IFDKIYSTKS 1080

DVWSYGVLLW EIFSLGGSPY PGVQMDEDFC SRLREGMRMR APEYSTPEIY QIMLDCWHRD 1140

PKERPRFAEL VEKLGDLLQA NVQQDGKDYI PINAILTGNS GFTYSTPAFS EDFFKESISA 1200

PKFNSGSSDD VRYVNAFKFM SLERIKTFEE LLPNATSMFD DYQGDSSTLL ASPMLKRFTW 1260 

TDSKPKASLK IDLRVTSKSK ESGLSDVSRP SFCHSSCGHV SEGKRRFTYD HAELERKIAC 1320 
CSPPPDYNSV VLYSTPPI 



Seq ID NO: 216 DNA sequence 

Nucleic Acid Accession #: NM_024689 

Coding sequence: 76-624 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

CTCTTTGGCC AAGCCCTGCC TCTGTACAGC CTCGAGTGGA CAGCCAGAGG CTGCAGCTGG 60
AGCCCAGAGC CCAAGATGGA GCCCCAGCTG GGGCCTGAGG CTGCCGCCCT CCGCCCTGGC 120
TGGCTGGCCC TGCTGCTGTG GGTCTCAGCC CTGAGCTGTT CTTTCTCCTT GCCAGCTTCT 180
TCCCTTTCTT CTCTGGTGCC CCAAGTCAGA ACCAGCTACA ATTTTGGAAG GACTTTCCTC 240
GGTCTTGATA AATGCAATGC CTGCATCGGG ACATCTATTT GCAAGAAGTT CTTTAAAGAA 300
GAAATAAGAT CTGACAACTG GCTGGCTTCC CACCTTGGAC TGCCTCCCGA TTCCTTGCTT 360
TCTTATCCTG CAAATTACTC AGATGATTCC AAAATCTGGC GCCCTGTGGA GATCTTTAGA 420
CTGGTCAGCA AATATCAAAA CGAGATCTCA GACAGGAAAA TCTGTGCCTC TGCATCAGCC 480
CCAAAGACCT GCAGCATTGA GCGTGTCCTG CGGAAAACAG AGAGGTTCCA GAAATGGCTG 540
CAGGCCAAGC GCCTCACGCC GGACCTGGTG CAGGACTGTC ACCAGGGCCA GAGAGAACTA 600
AAGTTCCTGT GTATGCTGAG ATAACACCAG TGAAAAAGCC TGGCATGGAG CCCAGCACTG 660
AGAACTTCCA GAAAGTGTTA GCCTTCTCCC AACTGTGTTA TACCAACCAC ATTTTCAAAT 720
AGTAATCATT AAAGAGGCTT CTGCATCAAA CCTTCACATG CAGCTCCCAT GCCACCCTCC 780
AGAATTCACC AACACACAGG CCCACCAGCA ACAGGCTACC TTTGCACAAT ATTCTCTGAT 840
GACAACTCCA AAGCCCCGGC TCTTTCCACC ACACTGTGGT CCCCTAGATG GGGCTGTTGC 900
TGAGCCCACC CCAATCCAGA TGTGATCCCC CTGTGATCTA CTTCTGGCAA GATTCTCAGT 960
CTGGACAGGT CTTCCCTATG AGATAGAACC TGATAAGGAG CTAGGGCAAT TCTGACAACA 1020
TTACCAAAGG CCCACATAAC TTCTAAATTT TGGTCTGGTC TGAAGGAAAA CCTGTTCTCG 1080
CCCTAGTGAT GGATGAACTC TCTTATCTCT GGCTTCTAGA GGGAAAAAAA AAGCATACCT 1140
CTTTTACTTT TTAAGTACCT CCATCAGAGT CATGAAATCA CCTGTCAAGA CTATCTATCT 1200
TTTATGTTTC CATTCTGGTA AGAACTCTTT AAATGAGGAC ACTGCTGATT GCTGGTGATG 1260
TTTTTTGAGC AAACACTCGG GGGTATGGAT GAAAGCCAAT CGCAGGTCAA ATGACTCCTT 1320
GGGGAAGCTA CTTCTCCTCT ATTCAGATTT CACTAAAATC TTCCAAGATG AAAGCAAATC 1380
TAGATTTCGG TCTTCATTGC TGTCCATTTT TGTAATGAAC GAGTGTTTTT CCTTTAGCTA 1440
GTGTATCAGG CAGGGTTCTA CCAGAGAAAC AGAACCAGTA GGAGATACAT ATACATGTCC 1500
AGATTTATTT CAAAGAATTG ATTTACATGA TTGTGGGGAT TGGCAAGTCG AAAATCCATA 1560
TGGTAGGCCT GCAATCTGTA AACCTTTGGG CAGGAGCTGA TGCTGTAGTT TGCAGATAGA 1620
ATTCCTTGTT CCTTAAAAAA ATCTGTTTTT GTTCTTAAGG GCTTTGAATX3 ATTGGATCAG 1680
GCCCACCCAG ATTACCTAGA TAATCTCTTT TACTTAAAGT AAACTGATTG TAGGTGCTAA 1740
TCACATCTAT GAAATGCCTT CACAGCAACA CCTAGATTAG CATTCAATTG AATAACTGGG 1800
GAATACAGCC TAGCCAAGTT GACACATAAA ATTAACCATC ACAGCAACAT GCCTGCTAAA 1860
TTTTATCGAC CGTCTTCAGA CTGTTAAGGA TTGTGGTAGA GAACTGTGAC AGCCACTCTC 1920
AGCATCACCC TGAACCAAAG GCCCCTATCA AGTAACAATA TAGCCAAGCA AAATTCCAGT 1980
CAATAGAGAC ATTGACTGGT TGGCTGGCTT CCCAAGGGAT AGCACCAGAC AAGAAATGCA 2040
AGGATGAGGA AACCAGGCAC GGGAGAGGGA GGGGCAACAG AGGTCCAGGG TTTGGTTATC 2100
TTTTTATTTT TCACTGGGAG GTGGTAAGTT AGCCCTGTTG CCCATGTATG CAGATGGGAG 2160
AAGTGATTTA GAAACTCCAA AGCAATTGGT AATCCCCAAA ATGGGTGTAT CTGGTTTGAA 2220
ATGAAACCTT ATTTTATTGG AAATGGTTGG TTTCCCAATT CTGTTTGCCA TTGGCCAATA 2280
TAATTGTGGG TTTGCACATG GCCAGCACAT GCCAAACAGA AGTAGACAAA GGTCTCACTC 2340
TGTAAGTGGG ACCTTGGGGA GGAGCTGCCT CCATCATAAA GGGAGGGGTT AGTAAAAATG 2400
GTCTCTTAAG CCTGTTCCTG CTACAGTTAT AGAGGTTGCT CAGAACCTTC TCAGCAAATA 2460
TAGCAGTTAT CTATTGTTGT GTATTAAACC ATTTCAACAC AT



Seq ID NO: 217 Protein sequence: 
Protein Accession #: NP_078965.1

1 11 21 31 41 51 

I I I I I I 

MEPQLGPEAA ALRPGWLALL LWVSALSCSF SLPASSLSSL VPQVRTSYNF GRTFLGLDKC 60
NACIGTSICK KFFKEEIRSD NWLASHLGLP PDSLLSYPAN YSDDSKIWRP VEIFRLVSKY 120
QNEISDRKIC ASASAPKTCS IERVLRKTER FQKWLQAKRL TPDLVQDCHQ GQRELKFLCM 180



Seq ID NO: 218 DNA sequence

Nucleic Acid Accession #: AF075027.1 

Coding sequence: 3-269 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

GATTAATTAA GTGCTTTAAA CGGTCTTGGT AAATATTCCG CGGGAGCTGG GGAGGACCGT 60

TGGGATGGCT GTAGCTTGAG TTGAATTTTA ACTGTCCTCA TTCTGGGTTT TGTCGCTCTG 120 

CTTTCTGTGC CAAGGTGCTG TGTTACGGGA GAGAGTGACT GGAAAGTAAC AAAGCTGAAT 180 

CTTTCTCCCT GGAGTAAGGC CGAAGACTGG ATTACTACAC GCCTAGACGT GACACTACAC 240 

CCATAGATCT CATGCATCAT TAATGCCATA TGACATTGCC ATTTTCTTTC TCAGTTCACG 300

GACAAAAGTG GTGGGTTTTC ATTGTCTTCA CTGATTGTCA ATGCATTAAT AAAGAAGATG 360 
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TGTGGT 



Seq ID NO: 219 Protein sequence:
Protein Accession #: AF075027 



1 11 21 31 41 51 

I I I I I I

ERKWQCHMAL MMHEIYGCSV TSRKWIQSS ALLQGERFSF VTFQSLSPVT QHLGTESRAT 60
KPRMRTVKIQ LKLQPSQRSS PAPAEYLPRP FKALN 

Seq ID NO: 220 DNA sequence 

Nucleic Acid Accession #: AL133411.8 

Coding sequence: 1-1395 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

I I I I I I 

ATGGGCAAGG ACTTCATGAC TAAAACACTA AAAGCAATGG CAACAAAAGC CAAAATTGAC 60

AAATGGGATC TAATCAAATT AAAGAGCTTC CGCACAGCAA AAGAAACTAT TATCAGAGTG 120 

AACAGGCAAC CTACAGAATG GGAGAAAAAT TTTGCAATGT ATCCATCTGA CAAAGGGCTG 180

ACATCCAGAA TCTATAAGGA ACTTAAACAA TTTTACAAGA AAAAACCAAA CAACGCCATC 240

AAAAAGGACA TGGATGAAGC TGGAAACCGT CATTCTCAGA AAACTAACAC AGGAACAGAA 300

AACCAAACAC CACATGTTCT CACTCATAAG TGGGAGTTGA ACAATGAGAA CACATGGACA 360 

CAGGGAGGGG AACATCACAC ACTGGGGCCT GTCAGAAGCC CCTCTGGCCT CCTGGCTGGC 420 

CTTGAACATG CTGGGAGGAA ATTACAATTC ATCCATGGGC TGTTTACCCT TGAAAATGAA 480

TGGGCCCAGG AACAATCCAT AATACAAAAG AAATATGCAT TATGGATTGG AACCAAGCAG 540 

ATCTGGGTGG CACAAACTCC TGGTGAATCT ATCTCCAGTT CACCAGCATT GCCTAATGTG 600 

CTACCTTTAA ATGAAGATGT TAATAAGCAG GAAGAAAAGA ATGAAGATCA TACTCCCAAT 660 

TATGCTCCTG CTAATGAGAA AAATGGCAAT TATTATAAAG ATATAAAACA ATATGTGTTC 720 

ACAACACAAA ATCCAAATGG CACTGAGTCT GAAATATCTG TGAGAGCCAC AACTGACCTG 780 

AATTTTGCTC TAAAAAACGA TAAAACTGTC AATGCAACTA CATATGAAAA ATCCACCATT 840

GAAGAAGAAA CAACTACTAG CGAACCCTCT CATAAAAATA TTCAAAGATC AACCCCAAAC 900 

GTGCCTGCAT TTTGGACAAT GTTAGCTAAA GCTATAAATG GAACAGCAGT GGTCATGGAT 960 

GATAAAGATC AATTATTTCA CCCAATTCCA GAGTCTGATG TGAATGCTAC ACAGGGAGAA 1020 

AATCAGCCAG ATCTAGAGGA TCTGAAGATC AAAATAATGC TGGGAATCTC GTTGATGACC 1080 

CTCCTCCTCT TTGTGGTCCT CTTGGCATTC TGTAGTGCTA CACTGTACAA ACTGAGGCAT 1140 

CTGAGTTATA AAAGTTGTGA GAGTCAGTAC TCTGTCAACC CAGAGCTGGC CACGATGTCT 1200 

TACTTTCATC CATCAGAAGG TGTTTCAGAT ACATCCTTTT CCAAGAGTGC AGAGAGCAGC 1260 

ACATTTTTGG GTACCACTTC TTCAGATATG AGAAGATCAG GCACAAGAAC ATCAGAATCT 1320 

AAGATAATGA CGGATATCAT TTCCATAGGC TCAGATAATG AGATGCATGA AAACGATGAG 1380
TCGGTTACCC GGTGA 



Seq ID NO: 221 Protein sequence: 
Protein Accession #: AL133411.8 



1 11 21 31 41 51 

I I I I I I 

MGKDFMTKTL KAMATKAKID KWDLIKLKSF RTAKETIIRV NRQPTEWEKN FAMYPSDKGL 60 

TSRIYKELKQ FYKKKPNNAI KKDMDEAGNR HSQKTNTGTE NQTPHVLTHK WELNNENTWT 120 

QGGEHHTLGP VRSPSGLLAG LEHAGRKLQF IHGLFTLENE WAQEQSIIQK KYALWIGTKQ 180

IWVAQTPGES ISSSPALPNV LPLNEDVNKQ EEKNEDHTPN YAPANEKNGN YYKDIKQYVF 240

TTQNPNGTES EISVRATTDL NFALKNDKTV NATTYEKSTI EEETTTSEPS HKNIQRSTPN 300

VPAFWTMLAK AINGTAVVMD DKDQLFHPIP ESDVNATQGE NQPDLEDLKI KIMLGISLMT 360

LLLFVVLLAF CSATLYKLRH LSYKSCESQY SVNPELATMS YFHPSEGVSD TSFSKSAESS 420
TFLGTTSSDM RRSGTRTSES KIMTDIISIG SDNEMHENDE SVTR
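
The "Coding sequence" ranges in these listings can be checked against the protein lengths with simple arithmetic. The sketch below is illustrative only and is not part of the specification. It assumes the coordinates are 1-based and inclusive and that the range ends with the stop codon, as the "start and stop codons" note in each header suggests. The function names cds_slice and expected_residues are hypothetical.

```python
def cds_slice(seq: str, start: int, end: int) -> str:
    """Return the coding region for 1-based, inclusive coordinates (assumed convention)."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def expected_residues(start: int, end: int) -> int:
    """Number of amino acids encoded by a CDS whose range includes the stop codon."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")
    # The final codon is the stop codon and contributes no residue.
    return length // 3 - 1


# Seq ID NO: 220 lists "Coding sequence: 1-1395".
print(expected_residues(1, 1395))  # 464

# Toy sequence (not from the listing): the CDS occupies positions 3-11.
print(cds_slice("AAATGCCCTGATT", 3, 11))  # ATGCCCTGA
```

For Seq ID NO: 220 the range 1-1395 gives 465 codons, hence 464 residues, which agrees with the length of the protein listed as Seq ID NO: 221.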



Seq ID NO: 222 DNA sequence 

Nucleic Acid Accession #: AL050295.1 

Coding sequence: 237-2073 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GAAGGGGACA GAAGGCAGTT CACCTCTGCT CCCGACAGCC TGGGAACCCG CAAGAGCCCC 60
AGCATTTGAA GTCTGGTCTT GTGAAACCCC ACCCTCCTCT GGCTGTGTGA TTGAATGGGA 120
TGCCCTCGAG GTACACCTCA CCTGAGAGGG TTTTGGGCAG ATCAGCAGTA AGGTGTTAAA 180
TTTTAGAAGC CTGAAAACTC CAGAAGAGAA AGGCCAACCA ACTCAAACTT GAAGACATGA 240
AATCCCCAAG GAGAACCACT TTGTGCCTCA TGTTTATTGT GATTTATTCT TCCAAAGCTG 300
CACTGAACTG GAATTACGAG TCTACTATTC ATCCTTTGAG TCTTCATGAA CATGAACCAG 360
CTGGTGAAGA GGCACTGAGG CAAAAACGAG CCGTTGCCAC AAAAAGTCCT ACGGCTGAAG 420
AATACACTGT TAATATTGAG ATCAGTTTTG AAAATGCATC CTTCCTGGAT CCTATCAAAG 480
CCTACTTGAA CAGCCTCAGT TTTCCAATTC ATGGGAATAA CACTGACCAA ATTACTGACA 540
TTTTGAGCAT AAATGTGACA ACAGTCTGCA GACCTGCTGG AAATGAAATC TGGTGCTCCT 600
GCGAGACAGG TTATGGGTGG CCTCGGGAAA GGTGTCTTCA CAATCTCATT TGTCAAGAGC 660
GTGAGGTCTT CCTCCCAGGG CACCATTGCA GTTGCCTTAA AGAACTGCCT CCCAATGGAC 720
CTTTTTGCCT GCTTCAGGAA GATGTTACCC TGAACATGAG AGTCAGACTA AATGTAGGCT 780
TTCAAGAAGA CCTCATGAAC ACTTCCTCCG CCCTCTATAG GTCCTACAAG ACCGACTTGG 840
AAACAGCGTT CCGGAAGGGT TACGGAATTT TACCAGGCTT CAAGGGCGTG ACTGTGACAG 900
GGTTCAAGTC TGGAAGTGTG GTTGTGACAT ATGAAGTCAA GACTACACCA CCATCACTTG 960
AGTTAATACA TAAAGCCAAT GAACAAGTTG TACAGAGCCT CAATCAGACC TACAAAATGG 1020
ACTACAACTC CTTTCAAGCA GTTACTATCA ATGAAAGCAA TTTCTTTGTC ACACCAGAAA 1080
TCATCTTTGA AGGGGACACA GTCAGTCTGG TGTGTGAAAA GGAAGTTTTG TCCTCCAATG 1140
TGTCTTGGCG CTATGAAGAA CAGCAGTTGG AAATCCAGAA CAGCAGCAGA TTCTCGATTT 1200
ACACCGCACT TTTCAACAAC ATGACTTCGG TGTCCAAGCT CACCATCCAC AACATCACTC 1260
CAGGTGATGC AGGTGAATAT GTTTGCAAAC TGATATTAGA CATTTTTGAA TATGAGTGCA 1320
AGAAGAAAAT AGATGTTATG CCCATCCAAA TTTTGGCAAA TGAAGAAATG AAGGTGATGT 1380
GCGACAACAA TCCTGTATCT TTGAACTGCT GCAGTCAGGG TAATGTTAAT TGGAGCAAAG 1440
TAGAATGGAA GCAGGAAGGA AAAATAAATA TTCCAGGAAC CCCTGAGACA GACATAGATT 1500
CTAGCTGCAG CAGATACACC CTCAAGGCTG ATGGAACCCA GTGCCCAAGC GGGTCGTCTG 1560
GAACAACAGT CATCTACACT TGTGAGTTCA TCAGTGCCTA TGGAGCCAGA GGCAGTGCAA 1620
ACATAAAAGT GACATTCATC TCTGTGGCCA ATCTAACAAT AACCCCGGAC CCAATTTCTG 1680
TTTCTGAGGG ACAAAACTTT TCTATAAAAT GCATCAGTGA TGTGAGTAAC TATGATGAGG 1740
TTTATTGGAA CACTTCTGCT GGAATTAAAA TATACCAAAG ATTTTATACC ACGAGGAGGT 1800
ATCTTGATGG AGCAGAATCA GTACTGACAG TCAAGACCTC GACCAGGGAG TGGAATGGAA 1860
CCTATCACTG CATATTTAGA TATAAGAATT CATACAGTAT TGCAACCAAA GACGTCATTG 1920
TTCACCCGCT GCCTCTAAAG CTGAACATCA TGATTGATCC TTTGGAAGCT ACTGTTTCAT 1980
GCAGTGGTTC CCATCACATC AAGTGCTGCA TAGAGGAGGA TGGAGACTAC AAAGTTACTT 2040
TCCATATGGG TTCCTCATCC CTTCCTGCTG TAAAAAAAAA AAAAAAAAAA A



Seq ID NO: 223 Protein sequence:
Protein Accession #: CAB43394.1

1 11 21 31 41 51

I I I I I I

MKSPRRTTLC LMFIVIYSSK AALNWNYBST XHPLSLHEHE PAGEEALRQK RAVATKSPTA 60
EEYTVNIEIS FENASFLDPI KAYLNSLSFP IHGNNTDQIT DILSINVTTV CRPAGNEIWC 120
SCETGYGWPR ERCLHNLICQ ERDVFLPGHH CSCLKELPPN GPFCLLQEDV TLNMRVRLNV 180
GFQEDLMNTS SALYRSYKTD LETAFRKGYG ILPGFKGVTV TGFKSGSWV TYEVKTTPPS 240
LELIHKANBQ WQSLNQTYK MDYNSFQAVT INESNFFVTP EIIFEGDTVS LVCEKEVLSS 300
NVSWRYEEQQ LEIQNSSRFS IYTALFNNMT SVSKLTIHNI TPGDAGEYVC KLILDIFEYE 360
CKKKIDVMPI QILANEEMKV MCDNNPVSLN CCSQGNVNWS KVEWKQEGKI NIPGTPETDI 420
DSSCSRYTLK ADGTQCPSGS SGTTVIYTCE FISAYGARGS ANIKVTFISV ANLTITPDPI 480
SVSEGQNFSI KCISDVSNYD EVYWNTSAGI KIYQRFYTTR RYLDGAESVL TVKTSTREWN 540
GTYHCIFRYK NSYSIATKDV IVHPLPLKLN IMIDPLEATV SCSGSHHIKC CIEEDGDYKV 600
TFHKGSSSLP AVKKKKKK



Seq ID NO: 224 DNA sequence 

Nucleic Acid Accession #: NM_007268 

Coding sequence: 46-1245 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

GGTAGCAGGA GGCTGGAAGA AAGGACAGAA GTAGCTCTGG CTGTGATGGG GATCTTACTG 60
GGCCTGCTAC TCCTGGGGCA CCTAACAGTG GACACTTATG GCCGTCCCAT CCTGGAAGTG 120
CCAGAGAGTG TAACAGGACC TTGGAAAGGG GATGTGAATC TTCCCTGCAC CTATGACCCC 180
CTGCAAGGCT ACACCCAAGT CTTGGTGAAG TGGCTGGTAC AACGTGGCTC AGACCCTGTC 240
ACCATCTTTC TACGTGACTC TTCTGGAGAC CATATCCAGC AGGCAAAGTA CCAGGGCCGC 300
CTGCATGTGA GCCACAAGGT TCCAGGAGAT GTATCCCTCC AATTGAGCAC CCTGGAGATG 360
GATGACCGGA GCCACTACAC GTGTGAAGTC ACCTGGCAGA CTCCTGATGG CAACCAAGTC 420
GTGAGAGATA AGATTACTGA GCTCCGTGTC CAGAAACTCT CTGTCTCCAA GCCCACAGTG 480
ACAACTGGCA GCGGTTATGG CTTCACGGTG CCCCAGGGAA TGAGGATTAG CCTTCAATGC 540
CAGGCTCGGG GTTCTCCTCC CATCAGTTAT ATTTGGTATA AGCAACAGAC TAATAACCAG 600
GAACCCATCA AAGTAGCAAC CCTAAGTACC TTACTCTTCA AGCCTGCGGT GATAGCCGAC 660
TCAGGCTCCT ATTTCTGCAC TGCCAAGGGC CAGGTTGGCT CTGAGCAGCA CAGCGACATT 720
GTGAAGTTTG TGGTCAAAGA CTCCTCAAAG CTACTCAAGA CCAAGACTGA GGCACCTACA 780
ACCATGACAT ACCCCTTGAA AGCAACATCT ACAGTGAAGC AGTCCTGGGA CTGGACCACT 840
GACATGGATG GCTACCTTGG AGAGACCAGT GCTGGGCCAG GAAAGAGCCT GCCTGTCTTT 900
GCCATCATCC TCATCATCTC CTTGTGCTGT ATGGTGGTTT TTACCATGGC CTATATCATG 960
CTCTGTCGGA AGACATCCCA ACAAGAGCAT GTCTACGAAG CAGCCAGGGC ACATGCCAGA 1020
GAGGCCAACG ACTCTGGAGA AACCATGAGG GTGGCCATCT TCGCAAGTGG CTGCTCCAGT 1080
GATGAGCCAA CTTCCCAGAA TCTGGGCAAC AACTACTCTG ATGAGCCCTG CATAGGACAG 1140
GAGTACCAGA TCATCGCCCA GATCAATGGC AACTACGCCC GCCTGCTGGA CACAGTTCCT 1200
CTGGATTATG AGTTTCTGGC CACTGAGGGC AAAAGTGTCT GTTAAAAATG CCCCATTAGG 1260
CCAGGATCTG CTGACATAAT TGCCTAGTCA GTCCTTGCCT TCTGCATGGC CTTCTTCCCT 1320
GCTACCTCTC TTCCTGGATA GCCCAAAGTG TCCGCCTACC AACACTGGAG CCGCTGGGAG 1380
TCACTGGCTT TGCCCTGGAA TTTGCCAGAT GCATCTCAAG TAAGCCAGCT GCTGGATTTG 1440
GCTCTGGGCC CTTCTAGTAT CTCTGCCGGG GGCTTCTGGT ACTCCTCTCT AAATACCAGA 1500
GGGAAGATGC CCATAGCACT AGGACTTGGT CATCATGCCT ACAGACACTA TTCAACTTTG 1560
GCATCTTGCC ACCAGAAGAC CCGAGGGAGG CTCAGCTCTG CCAGCTCAGA GGACCAGCTA 1620
TATCCAGGAT CATTTCTCTT TCTTCAGGGC CAGACAGCTT TTAATTGAAA TTGTTATTTC 1680
ACAGGCCAGG GTTCAGTTCT GCTCCTCCAC TATAAGTCTA ATGTTCTGAC TCTCTCCTGG 1740
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TGCTCAATAA ATATCTAATC ATAACAGCAA AAAAAAAAAA AAAAAAA 

Seq ID NO: 225 Protein sequence:
Protein Accession #: NP_009199.1

1 11 21 31 41 51

I I I I I I

MGILLGLLLL GHLTVDTYGR PILEVPESVT GPWKGDVNLP CTYDPLQGYT QVLVKWLVQR 60
GSDPVTIFLR DSSGDHIQQA KYOGRLHVSH KVPGDVSLQL STLEMDDRSH YTCEVTWQTP 120 
DGNQWRDKI TKLRVOKLSV SKPTVTTGSG YGFTVPQGMR ISLQCQARGS PPISYIWYKQ 180 
QTNNQEPIKV ATLSTLLPKP AVIADSGSYF CTAKGQVGSE QHSDIVKFW KDSSKLLKTK 240 
TEAPTTMTYP LKATSTVKQS WDWTTDMDGY LGETSAGPGK SLPVFAIILI ISLCCMWFT 300 
MAYIMLCRKT SQQEHVYEAA RAHAREANOS GETMRVAIFA SGCSSDEPTS QNLGNNYSDE 360 
PCIGQEYQII AQINGNYARL LDTVPLDYEF LATEGKSVC 



Seq ID NO: 226 DNA sequence 

Nucleic Acid Accession #: XM_064321

Coding sequence: 1-2079 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

I I I I I I 

ATGGTCGCCA GTTCCGATCA AGACAGAGCC CCGTATCTTC CAGGGACACT AGACAAGATG 60 

CCAGGACCAC GCCTCCGCTC TGCCCAGAGG CCAAAAGCAG CCCAACAAGA GCCCGGCATT 120

GAGCCTGGTA CTTACAGGGA GGGTGGTGGA GCCATCGTCC TCACGTATGC GCTGGGGATC 180 

GGGGTTGGGA TCACGGGAAA CACAGTTCAA CAACCACCTC AACTCACTGA CTCCGCCAGC 240 

ATCCGTCAGG AGGATGCCTT TGATAACAAA ATTGACATTG CTGAAGATGG TGGCCAGACA 300 

CCATACGAAG CTACCTTGCA GCAAAGCTTT CAATACTCAC CTACAACAGA TCTTCCTCCA 360 

CTCACAAATG GCTACCTGCC ATCAATCAGC ATGTATGAAA TTCAAACCAA ATACCAGTCG 420 

CATAATCAAT ATCCTAATGG AAATTCTAAA CAGAAGACCA CATTAAATTC TAGAAAACCC 480 

TTCCCCTCCA CAGCCACCAC TTCGGTACCA CAAACTGTGA TTCCAAAGAA GAGTGGCTCA 540 

CCTGAAGTTA AACTAAAAAT AACCAAAACT ATCCAGAATG GCAGGGAATT GTTCAAGTCT 600

TCCCTTTGTG GAGACCTTTT AAATGAAGTA CAGGCAAGTG AGCACACGAA GTCAAAGCAT 660 

GAAAGCAGAA AAGAAAAGAG GAAAAAACCC AAAAAGCATG ACTCATCAAG ATCTGAAGAG 720 

CGCAAGTCAC ACAAAATCCC CAAATTAGAA CCAGAGGAAC AAAATAGACC AAATGAGAGG 780 

GTTCACACCA TATCAGAAAA ACCAAGGGAA GATCCAGTAC TAAAAGAGGA AGCCCCAGTT 840 

CAGCCAATAC TATCTTCTGT TCCAACAACA GAAGTGTCCA CTGGTGTTAA GTTTCAAGTT 900 

GGTGATCTTG TGTGGTCCAA GGTGACGGTC ACACCCTGTT GGGTGCCCCG CCTGCGAGGA 960

CGGAGGAGCC ATCACTGTTC CAGCTGCCTG GAGATCTTGG TGCTGGTGCC AGCCCTCAGC 1020 

CTCAAGAGGT CTTTCATGGT TTCTTCCTTG AAGTTCCTCA CCTCCACGGG CAAACAGAAG 1080 

CCCACATTCA AGGGAACTGC CCAGATGGGC TGGTCACCTA TGGCCTCCAC GACCAATGTC 1140 

TCCCTGCTCC TTGGTCATTG GGAAGGAACA GACCAGATGT CATCCAGGGG CCCGGAATTT 1200 

GGGGGGCGCC GCTGGGTGTG GCAGCATCAG AAGCCTCAGA TCCGCATCTC CATCTGCCAC 1260 

AGGCCAGGGA AGGAACCTCT GAGACTCAGT TTCCTACGAT GTGAAGTGGA GAGAAGAATC 1320 

TCCTCTTTAG CCACCTCTCA GGGCTGCTGG TGTTCGCCCC CAGACCACGT CTGTGAGAAA 1380 

TGCTTAGAAG ACTATGCAGG GCGCCGCCAT TTGACACTCA GAGCCCAGGA AGCCTTTCTT 1440 

GGTCCAGACA GCAGGACTGG AAGCCTTAGA GCTGTCGGCA AGAGATACTG CAGGAACAGC 1500 

CAGCACCAGA GATATCTCCT GCAAGGCCTC CTAGGTGGGT TCTTGGAAGA AAGGAATGCC 1560 

AATGAATATG ATTGCAAGCT AGAGACGAGA GAAGCGGCGT CCTCAACTCC AAGAATCCCG 1620 

TATTCCCCAA CCCACATCCT TCAGTCTGAA AGTGCCCCTA ACCACTACTT TCCCTACCAC 1680 

GTCTCCCTTT CCAAGTTCCT CAAACGCAAA GCAAACAGCC ATTTCCTGCA CCTGTGTGCA 1740 

GTCGTAGCAG TACGTAGGAG ATCCAATATG CCTGGCACAA GGGGGTGGGG TGGCCACAAA 1800 

CAGAAGCAGC CCTGTCCTGC CAAGTACAGG CCTGCCTGCC ACGCACAATG GGAGACATTC 1860 

CGCAAGTTCC ACGTGATGGC TCAGAAGAGG GGCCTGTCAG GAAGATGTAG GGGCCAGCAG 1920 

CCCCCGGCCG CGCCCCGCAA GGTGGCTGAC AGACGCCAGC AGCTGCCGGG GGCTCCGGGC 1980 

TGCTCCTGCT CCCAGGATGT GTATCTGACT GGAGTTTCTG GATTAAAGGC CAGTCGTGGC 2040 
TTCATTCCAC ATCCCTGGGT GCCCTTCGGC TCCTCCTAG 



Seq ID NO: 227 Protein sequence: 
Protein Accession #: XP_064321.1 

1 11 21 31 41 51 

I I I I I I

MVASSDQDRA PYLPGTLDKM PGPRLRSAQR PKAAQQEPGI EPGTYREGGG AIVLTYALGI 60 

GVGITGNTVQ QPPQLTDSAS IRQEDAFDNK IDIAEDGGQT PYEATLQQSF QYSPTTDLPP 120 

LTNGYLPSIS MYEIQTKYQS HNQYPNGNSK QKTTLNSRKP FPSTATTSVP QTVIPKKSGS 180

PEVKLKITKT IQNGRELFKS SLCGDLLNEV QASEHTKSKH ESRKEKRKKP KKHDSSRSEE 240

RKSHKIPKLE PEEQNRPNER VHTISEKPRE DPVLKEEAPV QPILSSVPTT EVSTGVKFQV 300 

GDLVWSKVTV TPCWVPRLRG RRSHHCSSCL EILVLVPALS LKRSFMVSSL KFLTSTGKQK 360 

PTFKGTAQMG WSPMASTTNV SLLLGHWEGT DQMSSRGPEF GGRRWVWQHQ KPQIRISICH 420 

RPGKEPLRLS FLRCEVERRI SSLATSQGCW CSPPDHVCEK CLEDYAGRRH LTLRAQEAFL 480

GPDSRTGSLR AVGKRYCRNS QHQRYLLQGL LGGFLEERNA NEYDCKLETR EAASSTPRIP 540 

YSPTHILQSE SAPNHYFPYH VSLSKFLKRK ANSHFLHLCA VVAVRRRSNM PGTRGWGGHK 600

QKQPCPAKYT PACHAQWETF RKFHVMAQKR GLSGRCRGQQ PPAAPRKVAD RRQQLPGAPG 660 
CSCSQDVYLT GVSGLKASRG FIPHPWVPFG SS 
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Seq ID NO: 228 DNA sequence 

Nucleic Acid Accession #: NM_006033

Coding sequence: 253-1752 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

I I I I I I

AGCAGCGAGT CCTTGCCTCC CGGCGGCTCA GGACGAGGGC AGATCTCGTT CTGGGGCAAG 60
CCGTTGACAC TCGCTCCCTG CCACCGCCCG GGCTCCGTGC CGCCAAGTTT TCATTTTCCA 120
CCTTCTCTGC CTCCAGTCCC CCAGCCCCTG GCCGAGAGAA GGGTCTTACC GGCCGGGATT 180
GCTGGAAACA CCAAGAGGTG GTTTTTGTTT TTTAAAACTT CTGTTTCTTG GGAGGGGGTG 240
TGGCGGGGCA GGATGAGCAA CTCCGTTCCT CTGCTCTGTT TCTGGAGCCT CTGCTATTGC 300
TTTGCTGCGG GGAGCCCCGT ACCTTTTGGT CCAGAGGGAC GGCTGGAAGA TAAGCTCCAC 360
AAACCCAAAG CTACACAGAC TGAGGTCAAA CCATCTGTGA GGTTTAACCT CCGCACCTCC 420
AAGGACCCAG AGCATGAAGG ATGCTACCTC TCCGTCGGCC ACAGCCAGCC CTTAGAAGAC 480
TGCAGTTTCA ACATGACAGC TAAAACCTTT TTCATCATTC ACGGATGGAC GATGAGCGGT 540
ATCTTTGAAA ACTGGCTGCA CAAACTCGTG TCAGCCCTGC ACACAAGAGA GAAAGACGCC 600
AATGTAGTTG TGGTTGACTG GCTCCCCCTG GCCCACCAGC TTTACACGGA TGCGGTCAAT 660
AATACCAGGG TGGTGGGACA CAGCATTGCC AGGATGCTCG ACTGGCTGCA GGAGAAGGAC 720
GATTTTTCTC TCGGGAATGT CCACTTGATC GGCTACAGCC TCGGAGCGCA CGTGGCCGGG 780
TATGCAGGCA ACTTCGTGAA AGGAACGGTG GGCCGAATCA CAGGTTTGGA TCCTGCCGGG 840
CCCATGTTTG AAGGGGCCGA CATCCACAAG AGGCTCTCTC CGGACGATGC AGATTTTGTG 900
GATGTCCTCC ACACCTACAC GCGTTCCTTC GGCTTGAGCA TTGGTATTCA GATGCCTGTG 960
GGCCACATTG ACATCTACCC CAATGGGGGT GACTTCCAGC CAGGCTGTGG ACTCAACGAT 1020
GTCTTGGGAT CAATTGCATA TGGAACAATC ACAGAGGTGG TAAAATGTGA GCATGAGCGA 1080
GCCGTCCACC TCTTTGTTGA CTCTCTGGTG AATCAGGACA AGCCGAGTTT TGCCTTCCAG 1140
TGCACTGACT CCAATCGCTT CAAAAAGGGG ATCTGTCTGA GCTGCCGCAA GAACCGTTGT 1200
AATAGCATTG GCTACAATGC CAAGAAAATG AGGAACAAGA GGAACAGCAA AATGTACCTA 1260
AAAACCCGGG CAGGCATGCC TTTCAGAGTT TACCATTATC AGATGAAAAT CCATGTCTTC 1320
AGTTACAAGA ACATGGGAGA AATTGAGCCC ACCTTTTACG TCACCCTTTA TGGCACTAAT 1380
GCAGATTCCC AGACTCTGCC ACTGGAAATA GTGGAGCGGA TCGAGCAGAA TGCCACCAAC 1440
ACCTTCCTGG TCTACACCGA GGAGGACTTG GGAGACCTCT TGAAGATCCA GCTCACCTGG 1500
GAGGGGGCCT CTCAGTCTTG GTACAACCTG TGGAAGGAGT TTCGCAGCTA CCTGTCTCAA 1560
CCCCGCAACC CCGGACGGGA GCTGAATATC AGGCGCATCC GGGTGAAGTC TGGGGAAACC 1620
CAGCGGAAAC TGACATTTTG TACAGAAGAC CCTGAGAACA CCAGCATATC CCCAGGCCGG 1680
GAGCTCTGGT TTCGCAAGTG TCGGGATGGC TGGAGGATGA AAAACGAAAC CAGTCCCACT 1740
GTGGAGCTTC CCTGAGGGTG CCCGGGCAAG TCTTGCCAGC AAGGCAGCAA GACTTCCTGC 1800
TATCCAAGCC CATGGAGGAA AGTTACTGCT GAGGACCCAC CCAATGGAAG GATTCTTCTC 1860
AGCCTTGACC CTGGAGCACT GGGAACAACT GGTCTCCTGT GATGGCTGGG ACTCCTCGCG 1920
GGAGGGGACT GCGCTGCTAT AGCTCTTGCT GCCTCTCTTG AATAGCTCTA ACTCCAAACC 1980
TCTGTCCACA CCTCCAGAGC ACCAAGTCCA GATTTGTGTG TAAGCAGCTG GGTGCCTGGG 2040
GCCTCTCGTG CACACTGGAT TGGTTTCTCA GTTGCTGGGC GAGCCTGTAC TCTGCCTGAC 2100
GAGGAACGCT GGCTCCGAAG AGGCCCTGTG TAGAAGGCTG TCAGCTGCTC AGCCTGCTTT 2160
GAGCCTCAGT GAGAAGTCCT TCCGACAGGA GCTGACTCAT GTCAGGATGG CAGGCCTGGT 2220
ATCTTGCTCG GGCCCTAGCT GTTGGGGTTC TCATGGGTTG CACTGACCAT ACTGCTTACG 2280
TCTTAGCCAT TCCGTCCTGC TCCCCAGCTC ACTCTCTGAA GCACACATCA TTGGCTTTCC 2340
TATTTTTCTG TTCATTTTTT AATTGAGCAA ATGTCTATTG AACACTTAAA ATTAATTAGA 2400
ATGTGGTAAT GGACATATTA CTGAGCCTCT CCATTTGGAA CCCAGTGGAG TTGGGATTTC 2460
TAGACCCTCT TTCTGTTTGG ATGGTGTATG TGTATATGCA TGGGGAAAGG CACCTGGGGC 2520
CTGGGGGAGG CTATAGGATA TAAGCATTAG GGACCCTGAG GCTTTAAGTG GTTTCTATTT 2580
CTTCTTAGTT ATTATGTGCC ACCTTCTTAG TTATTATGTG CCACCTCCCC TATGAGTGAC 2640
GTGTTTGATC ACTAGCAGAA TAGCAAGCAG AGTATCATTC ATGCTGGGGC CAGAATGATG 2700
GCCGGTTGCC AGATATAACT GCTTTGGAGC AAATCTCTTC TGTTTAGAGA GATAGAAGTT 2760
ATGACATATG TAATACACAT CTGTGTACAC AGAAACCGGC ACCTGCCAGA CAGAGCTGGT 2820
TCTAAGATTT AATACAGTGC TTTTTTTCCT CTTTGAAATA TTTTACTTTA ATACCAGTGC 2880
CTTTTCTTGT TGAACTTCTT GGAAAAGCCA CCAATTCTAG ATCTTGATTT GAATTAATAC 2940
ACACAATATC TGAGACACTT ACACTTTTCA AAAGATTTGT GTATGCATTG CCTAATTAGA 3000
GTAGGGGGAG AAGGGCAACT ATTATTATCC CTATTTTACA AAACTGAGGC TTAGTGAGGT 3060
TCAGCCACAT GCCTAGACTT ATATACTAGT TAGTGGTGCA GCCAGGGAGA GGACTCAGAT 3120
TTCCTGGAGG CAAAGTCTAT CTCTGAAACT CCATGAAGAC TTTTGCAGCC AGTTCCCACC 3180
AATATGCCCC AGACGTGAGA CAAACAAGGA CTTTTTTTTT TATATAGAGC CATCCATAAA 3240
ATCCTAAGCC CTTTTATTAA TGTATAACCA GGAGAACATC TGTGCCAACG GTTGGACTTT 3300
TTATGGCTGA GATTCGGGAG GAAGTGTGAC ACCAAGCAGG AGAGGAAGAA TGATTTTCTT 3360
TGTACTTAGG TTTTCTAAGG ACATTGTTTT AATCTGTATC GTGCCAAAGT TGTATCACTG 3420
TTAAACTTCT GAAGACATAA CCAGTTGAGT CTTATTTCAA GATATGTTCT CAAGCCAATT 3480
GTGTGCTTCT CTTGTTTCTG TGATTGCTTT CTAGCCAAAG CGAAGCTTGT ACAGGTTGAG 3540
TATCCCTTAT CCAAAATGCT TGGAACCAGA AGTGTTTCAA ATTTTAGATT ATTTTCAGAT 3600
TTTGGAATGT TTGCATATAC ATAATGAGAT ATTTTGGGAA TAGGACCCGA GCCTAAACAC 3660
AAAATTCATT GATGTGTCAG TTACACCTTA TCCACATAGC CTGAGGGTAA TTTTATACGA 3720
TATTTTAAAT AGTTGTGTAC ATGAAGCATG GTTTGTGGTA ACTTATGTGA GGGGTTTTCC 3780
CATTTTTTGT CTTGTTGGTG CTCAAAAAGT TTTGGATTTT GGAGCATTTC GGATTTTGGA 3840
TTTTTGGATT AGGGTTGCTC AACCCATATT ATTGGCTGTA CATCCTGGTC ACTTCTGACT 3900
TCTGTTTTTA CTAATGGAAG CTTTGCA



Seq ID NO: 229 Protein sequence: 
Protein Accession #: NP_006024.1

1 11 21 31 41 51

I I I I I I

MSNSVPLLCF WSLCYCFAAG SPVPFGPEGR LEDKLHKPKA TQTEVKPSVR FNLRTSKDPE 60

HEGCYLSVGH SQPLEDCSFH MTAKTFFIIH GWTMSGIFEN WLHKLVSALH TREKDANVVV 120

VDWLPLAHQL YTDAVNNTRV VGHSIARMLD WLQEKDDFSL GNVHLIGYSL GAHVAGYAGN 180

FVKGTVGRIT GLDPAGPMFE GADIHKRLSP DDADPVDVLH TYTRSFGLSI GIQMPVGHID 240

IYPNGGDFQP GCGLNDVLGS IAYGTITEVV KCEHERAVHL FVDSLVNQDK PSFAFQCTDS 300

NRFKKGICLS CRKNRCNSIG YNAKKMRNKR NSKMYLKTRA GMPFRVYHYQ MKIHVFSYKN 360

MGEIEPTFYV TLYGTNADSQ TLPLEIVERI EQNATNTFLV YTEEDLGDLL KIQLTWEGAS 420

QSWYNLWKEF RSYLSQPRNP GRELNIRRIR VKSGETQRKL TFCTEDPENT SISPGRELWF 480
RKCRDGWRMK NETSPTVELP 



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It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference.
WHAT IS CLAIMED IS:

1. A method of detecting an angiogenesis-associated transcript in a cell in
a patient, the method comprising contacting a biological sample from the patient with a
polynucleotide that selectively hybridizes to a sequence at least 80% identical to a sequence
as shown in Tables 1-8.

2. The method of claim 1, wherein the biological sample is a tissue sample.

3. The method of claim 1, wherein the biological sample comprises
isolated nucleic acids.

4. The method of claim 3, wherein the nucleic acids are mRNA.

5. The method of claim 3, further comprising the step of amplifying
nucleic acids before the step of contacting the biological sample with the polynucleotide.

6. The method of claim 1, wherein the polynucleotide comprises a
sequence as shown in Tables 1-8.

7. The method of claim 1, wherein the polynucleotide is labeled.

8. The method of claim 7, wherein the label is a fluorescent label.

9. The method of claim 1, wherein the polynucleotide is immobilized on
a solid surface.

10. The method of claim 1, wherein the patient is undergoing a therapeutic
regimen to treat a disease associated with angiogenesis.

11. The method of claim 1, wherein the patient is suspected of having cancer.

12. An isolated nucleic acid molecule consisting of a polynucleotide
sequence as shown in Tables 1-8.

13. The nucleic acid molecule of claim 12, which is labeled.

14. The nucleic acid of claim 13, wherein the label is a fluorescent label.

15. An expression vector comprising the nucleic acid of claim 12.

16. A host cell comprising the expression vector of claim 15.

17. An isolated polypeptide which is encoded by a nucleic acid molecule
having a polynucleotide sequence as shown in Tables 1-8.

18. An antibody that specifically binds a polypeptide of claim 17.

19. The antibody of claim 18, further conjugated or fused to an effector
component.

20. The antibody of claim 19, wherein the effector component is a
fluorescent label.

21. The antibody of claim 19, wherein the effector component is a
radioisotope.

22. The antibody of claim 19, which is an antibody fragment.

23. The antibody of claim 19, which is a humanized antibody.

24. A method of detecting a cell undergoing angiogenesis in a biological
sample from a patient, the method comprising contacting the biological sample with an
antibody of claim 18.

25. The method of claim 24, wherein the antibody is further conjugated or
fused to an effector component.

26. The method of claim 25, wherein the effector component is a
fluorescent label.

27. The method of detecting antibodies specific to angiogenesis in a
patient, the method comprising contacting a biological sample from the patient with a
polypeptide which is encoded by a nucleotide sequence of Tables 1-8.
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